12 research outputs found

    Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data

    Objective: To develop a conceptual prediction model framework containing standardized steps and describe the corresponding open-source software developed to consistently implement the framework across computational environments and observational healthcare databases to enable model sharing and reproducibility. Methods: Based on existing best practices we propose a 5-step standardized framework for: (1) transparently defining the problem; (2) selecting suitable datasets; (3) constructing variables from the observational data; (4) learning the predictive model; and (5) validating the model performance. We implemented this framework as open-source software utilizing the Observational Medical Outcomes Partnership Common Data Model to enable convenient sharing of models and reproduction of model evaluation across multiple observational datasets. The software implementation contains default covariates and classifiers but the framework enables customization and extension. Results: As a proof-of-concept, demonstrating the transparency and ease of model dissemination using the software, we developed prediction models for 21 different outcomes within a target population of people suffering from depression across 4 observational databases. All 84 models are available in an accessible online repository to be implemented by anyone with access to an observational database in the Common Data Model format. Conclusions: The proof-of-concept study illustrates the framework's ability to develop reproducible models that can be readily shared and offers the potential to perform extensive external validation of models, and improve their likelihood of clinical uptake. In future work the framework will be applied to perform an "all-by-all" prediction analysis to assess the observational data prediction domain across numerous target populations, outcomes and time, and risk settings.

    Bayesian inference reveals host-specific contributions to the epidemic expansion of influenza A H5N1

    Since its first isolation in 1996 in Guangdong, China, the highly pathogenic avian influenza virus (HPAIV) H5N1 has circulated in avian hosts for almost two decades and spread to more than 60 countries worldwide. The role of different avian hosts and the domestic-wild bird interface has been critical in shaping the complex HPAIV H5N1 disease ecology, but remains difficult to ascertain. To shed light on the large-scale H5N1 transmission patterns and disentangle the contributions of different avian hosts on the tempo and mode of HPAIV H5N1 dispersal, we apply Bayesian evolutionary inference techniques to comprehensive sets of hemagglutinin and neuraminidase gene sequences sampled between 1996 and 2011 throughout Asia and Russia. Our analyses demonstrate that the large-scale H5N1 transmission dynamics are structured according to different avian flyways, and that the incursion of the Central Asian flyway specifically was driven by Anatidae hosts coinciding with a rapid rate of spread and an epidemic wavefront acceleration. This also resulted in long-distance dispersal that is likely to be explained by wild bird migration. We identify a significant degree of asymmetry in the large-scale transmission dynamics between Anatidae and Phasianidae, with the latter largely representing poultry as an evolutionary sink. A joint analysis of host dynamics and continuous spatial diffusion demonstrates that the rate of viral dispersal and host diffusivity is significantly higher for Anatidae compared with Phasianidae. These findings complement risk modeling studies and satellite tracking of wild birds in demonstrating a continental-scale structuring into areas of H5N1 persistence that are connected through migratory waterfowl.

    Interpreting observational studies: Why empirical calibration is needed to correct p-values

    Often the literature makes assertions of medical product effects on the basis of 'p<0.05'. The underlying premise is that at this threshold, there is only a 5% probability that the observed effect would be seen by chance when in reality there is no effect. In observational studies, much more than in randomized trials, bias and confounding may undermine this premise. To test this premise, we selected three exemplar drug safety studies from the literature, representing a case-control, a cohort, and a self-controlled case series design. We attempted to replicate these studies as best we could for the drugs studied in the original articles. Next, we applied the same three designs to sets of negative controls: drugs that are not believed to cause the outcome of interest. We observed how often p<0.05 when the null hypothesis is true, and we fitted distributions to the effect estimates. Using these distributions, we computed calibrated p-values that reflect the probability of observing the effect estimate under the null hypothesis, taking both random and systematic error into account. An automated analysis of scientific literature was performed to evaluate the potential impact of such a calibration. Our experiment provides evidence that the majority of observational studies would declare statistical significance when no effect is present. Empirical calibration was found to reduce spurious results to the desired 5% level. Applying these adjustments to literature suggests that at least 54% of findings with p<0.05 are not actually statistically significant and should be reevaluated.
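The calibration idea in this abstract can be illustrated with a minimal sketch. It assumes the empirical null is a normal distribution on the log effect scale and fits it by a simple method of moments; the abstract does not specify the fitting procedure, so this is a stand-in, not the authors' method. The function names and the negative-control numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, pvariance


def fit_null(log_estimates, std_errors):
    """Fit a normal empirical null (assumed form) to negative-control estimates.

    The systematic-error variance is approximated as the observed variance of
    the estimates minus the average sampling variance, floored at zero.
    """
    mu = mean(log_estimates)
    sampling_var = mean([se ** 2 for se in std_errors])
    systematic_var = pvariance(log_estimates, mu) - sampling_var
    return mu, sqrt(max(systematic_var, 0.0))


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def traditional_p(log_rr, se):
    """Traditional p-value: random error only, null centred on no effect."""
    return two_sided_p(log_rr / se)


def calibrated_p(log_rr, se, mu, sigma):
    """Calibrated p-value: null shifted by mu and widened by sigma."""
    return two_sided_p((log_rr - mu) / sqrt(sigma ** 2 + se ** 2))


# Hypothetical negative controls whose estimates are biased upward.
nc_log_rr = [0.2, 0.4, 0.1, 0.5, 0.3, 0.35, 0.15, 0.45]
nc_se = [0.1] * len(nc_log_rr)
mu, sigma = fit_null(nc_log_rr, nc_se)

# A new estimate of the same size as the typical bias.
p_traditional = traditional_p(0.3, 0.1)
p_calibrated = calibrated_p(0.3, 0.1, mu, sigma)
```

With these made-up inputs, an estimate that looks significant under the traditional test is no longer significant once the systematic error seen in the negative controls is taken into account.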

    SPREAD 4: online visualisation of pathogen phylogeographic reconstructions

    Phylogeographic analyses aim to extract information about pathogen spread from genomic data, and visualising spatio-temporal reconstructions is a key aspect of this process. Here we present SPREAD 4, a feature-rich web-based application that visualises estimates of pathogen dispersal resulting from Bayesian phylogeographic inference using BEAST on a geographic map, offering zoom-and-filter functionality and smooth animation over time. SPREAD 4 takes as input phylogenies with both discrete and continuous location annotation and offers customised visualisation as well as generation of publication-ready figures. SPREAD 4 now features account-based storage and easy sharing of visualisations by means of unique web addresses. SPREAD 4 is intuitive to use and is available online at https://spreadviz.org, with an accompanying web page containing answers to frequently asked questions at https://beast.community/spread4.

    Dispersal history and bidirectional human-fish host switching of invasive, hypervirulent Streptococcus agalactiae sequence type 283

    Human group B Streptococcus (GBS) infections attributable to an invasive, hypervirulent sequence type (ST) 283 have been associated with freshwater fish consumption in Asia. The origin, geographic dispersion pathways and host transitions of GBS ST283 remain unresolved. We gather 328 ST283 isolate whole-genome sequences collected from humans and fish between 1998 and 2021, representing eleven countries across four continents. We apply Bayesian phylogeographic analyses to reconstruct the dispersal history of ST283 and combine ST283 phylogenies with genetic markers and host association to investigate host switching and the gain and loss of antimicrobial resistance and virulence factor genes. Initial dispersal within Asia followed ST283 emergence in the early 1980s, with Singapore, Thailand and Hong Kong observed as early transmission hubs. Subsequent intercontinental dispersal originating from Vietnam began in the decade commencing 2001, demonstrating ST283 holds potential to expand geographically. Furthermore, we observe bidirectional host switching, with the detection of more frequent human-to-fish than fish-to-human transitions, suggesting that sound wastewater management, hygiene and sanitation may help to interrupt chains of transmission between hosts. We also show that antimicrobial resistance and virulence factor genes were lost more frequently than gained across the evolutionary history of ST283. Our findings highlight the need for enhanced surveillance, clinical awareness, and targeted risk mitigation to limit transmission and reduce the impact of an emerging pathogen associated with a high-growth aquaculture industry.

    Phylogeography Reveals Association between Swine Trade and the Spread of Porcine Epidemic Diarrhea Virus in China and across the World

    The ongoing SARS (severe acute respiratory syndrome)-CoV (coronavirus)-2 pandemic has exposed major gaps in our knowledge on the origin, ecology, evolution, and spread of animal coronaviruses. Porcine epidemic diarrhea virus (PEDV) is a member of the genus Alphacoronavirus in the family Coronaviridae that may have originated from bats and leads to significant hazards and widespread epidemics in the swine population. The role of local and global trade of live swine and swine-related products in disseminating PEDV remains unclear, especially in developing countries with complex swine production systems. Here, we undertake an in-depth phylogeographic analysis of PEDV sequence data (including 247 newly sequenced samples) and employ an extension of this inference framework that enables formally testing the contribution of a range of predictor variables to the geographic spread of PEDV. Within China, the provinces of Guangdong and Henan were identified as primary hubs for the spread of PEDV, for which we estimate live swine trade to play a very important role. On a global scale, the United States and China maintain the highest number of PEDV lineages. We estimate that, after an initial introduction out of China, the United States acted as an important source of PEDV introductions into Japan, Korea, China, and Mexico. Live swine trade also explains the dispersal of PEDV on a global scale. Given the increasingly global trade of live swine, our findings have important implications for designing prevention and containment measures to combat a wide range of livestock coronaviruses.

    Genomic epidemiology, evolution, and transmission dynamics of porcine deltacoronavirus

    The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has shown once again that coronaviruses (CoVs) in animals are potential sources for epidemics in humans. Porcine deltacoronavirus (PDCoV) is an emerging enteropathogen of swine with a worldwide distribution. Here, we implemented and described an approach to analyze the epidemiology of PDCoV following its emergence in the pig population. We performed an integrated analysis of full genome sequence data from 21 newly sequenced viruses, along with comprehensive epidemiological surveillance data collected globally over the last 15 years. We found four distinct phylogenetic lineages of PDCoV, which differ in their geographic circulation patterns. Interestingly, we identified more frequent intra- and interlineage recombination and higher virus genetic diversity in the Chinese lineages compared with the USA lineage where pigs are raised in different farming systems and ecological environments. Most recombination breakpoints are located in the ORF1ab gene rather than in genes encoding structural proteins. We also identified five amino acids under positive selection in the spike protein suggesting a role for adaptive evolution. According to structural mapping, three positively selected sites are located in the N-terminal domain of the S1 subunit, which is most likely involved in binding to a carbohydrate receptor, whereas the other two are located in or near the fusion peptide of the S2 subunit and thus might affect membrane fusion. Finally, our phylogeographic investigations highlighted notable South-North transmission as well as frequent long-distance dispersal events in China that could implicate human-mediated transmission. Our findings provide new insights into the evolution and dispersal of PDCoV that contribute to our understanding of the critical factors involved in CoVs emergence.

    Predicting the evolution of the Lassa virus endemic area and population at risk over the next decades

    Lassa fever is a severe viral hemorrhagic fever caused by a zoonotic virus that repeatedly spills over to humans from its rodent reservoirs. It is currently not known how climate and land use changes could affect the endemic area of this virus, currently limited to parts of West Africa. By exploring the environmental data associated with virus occurrence using ecological niche modelling, we show how temperature, precipitation and the presence of pastures determine ecological suitability for virus circulation. Based on projections of climate, land use, and population changes, we find that regions in Central and East Africa will likely become suitable for Lassa virus over the next decades and estimate that the total population living in ecological conditions that are suitable for Lassa virus circulation may drastically increase by 2070. By analysing geotagged viral genomes using spatially-explicit phylogeography and simulating virus dispersal, we find that in the event of Lassa virus being introduced into a new suitable region, its spread might remain spatially limited over the first decades.

    Comparative safety and effectiveness of alendronate versus raloxifene in women with osteoporosis

    Alendronate and raloxifene are among the most popular anti-osteoporosis medications. However, there is a lack of head-to-head comparative effectiveness studies comparing the two treatments. We conducted a retrospective large-scale multicenter study encompassing over 300 million patients across nine databases encoded in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The primary outcome was the incidence of osteoporotic hip fracture, while secondary outcomes were vertebral fracture, atypical femoral fracture (AFF), osteonecrosis of the jaw (ONJ), and esophageal cancer. We used propensity score trimming and stratification based on an expansive propensity score model with all pre-treatment patient characteristics. We accounted for unmeasured confounding using negative control outcomes to estimate and adjust for residual systematic bias in each data source. We identified 283,586 alendronate patients and 40,463 raloxifene patients. There were 7.48 hip fracture, 8.18 vertebral fracture, 1.14 AFF, 0.21 esophageal cancer and 0.09 ONJ events per 1,000 person-years in the alendronate cohort and 6.62, 7.36, 0.69, 0.22 and 0.06 events per 1,000 person-years, respectively, in the raloxifene cohort. Alendronate and raloxifene have a similar hip fracture risk (hazard ratio [HR] 1.03, 95% confidence interval [CI] 0.94–1.13), but alendronate users are more likely to have vertebral fractures (HR 1.07, 95% CI 1.01–1.14). Alendronate has higher risk for AFF (HR 1.51, 95% CI 1.23–1.84) but similar risk for esophageal cancer (HR 0.95, 95% CI 0.53–1.70), and ONJ (HR 1.62, 95% CI 0.78–3.34). We demonstrated substantial control of measured confounding by propensity score adjustment, and minimal residual systematic bias through negative control experiments, lending credibility to our effect estimates. Raloxifene is as effective as alendronate and may remain an option in the prevention of osteoporotic fracture.
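As a reading aid for the hazard ratios quoted in this abstract, the short sketch below checks which 95% confidence intervals exclude 1 and recovers an approximate standard error on the log scale. The hazard ratios and intervals are copied from the abstract; the function names are hypothetical, and the standard-error formula assumes the intervals are symmetric on the log scale with a normal critical value of 1.96.

```python
from math import log

# (hazard ratio, lower 95% CI bound, upper 95% CI bound), alendronate vs raloxifene,
# as reported in the abstract.
estimates = {
    "hip fracture": (1.03, 0.94, 1.13),
    "vertebral fracture": (1.07, 1.01, 1.14),
    "atypical femoral fracture": (1.51, 1.23, 1.84),
    "esophageal cancer": (0.95, 0.53, 1.70),
    "osteonecrosis of the jaw": (1.62, 0.78, 3.34),
}


def excludes_null(lower, upper):
    """True when the interval lies entirely above or entirely below HR = 1."""
    return lower > 1.0 or upper < 1.0


def log_standard_error(lower, upper, z=1.96):
    """Approximate SE of log(HR), assuming a log-symmetric normal interval."""
    return (log(upper) - log(lower)) / (2.0 * z)


# Outcomes whose interval excludes 1, in the order listed above.
differing = [
    name for name, (_, lower, upper) in estimates.items()
    if excludes_null(lower, upper)
]

for name, (hr, lower, upper) in estimates.items():
    se = log_standard_error(lower, upper)
    print(f"{name}: HR {hr:.2f}, log-scale SE {se:.3f}, "
          f"excludes 1: {excludes_null(lower, upper)}")
```

Only the vertebral fracture and AFF intervals exclude 1, which matches the abstract's wording of "similar" risk for the other three outcomes.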