18 research outputs found

    Composite likelihood inference in a discrete latent variable model for two-way "clustering-by-segmentation" problems

    Full text link
    We consider a discrete latent variable model for two-way data arrays, which allows one to simultaneously produce clusters along one of the data dimensions (e.g. exchangeable observational units or features) and contiguous groups, or segments, along the other (e.g. consecutively ordered times or locations). The model relies on a hidden Markov structure but, given its complexity, cannot be estimated by full maximum likelihood. We therefore introduce composite likelihood methodology based on considering different subsets of the data. The proposed approach is illustrated by simulation and with an application to genomic data.

    Segmenting the human genome based on states of neutral genetic divergence.

    No full text
    Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states--each uniquely characterized by specific combinations of divergence levels. We then parsed the mutagenic contributions of various biochemical processes associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants--including those associated with cancer and other diseases--and to improve computational predictions of noncoding functional elements. Proc Natl Acad Sci U S A 2013 Sep 3; 110(36):14699-704

    Mixture-based path clustering for synthesis of ECMWF ensemble forecasts of tropical cyclone evolution

    No full text
    In this article, three tropical cyclones and their 120-h, 50-member ECMWF Integrated Forecasting System (IFS) ensemble track forecasts at 10 initialization times are considered. The IFS forecast tracks are clustered with a regression mixture model, and two traditional diagnostics (the Bayesian information criterion and a measure of strength of cluster assignment) are used to determine the optimal polynomial order and number of clusters to use in the model. In addition, cross-validation versions of the two diagnostics are formulated and computed to further aid in model selection. Both traditional and cross-validation diagnostics suggest that third-order polynomials and five clusters are effective options, although the evidence is less conclusive for the number of clusters than for the polynomial order, and the cross-validation diagnostics favor a smaller number of clusters than the traditional ones. Path clustering of IFS tropical cyclone track forecasts with this third-order polynomial, five-cluster regression mixture model produces interpretable partitions by direction and speed of motion for each of the storms and initialization times considered. Thus, this approach effectively synthesizes the forecast spreads within the IFS into a small number of representative trajectories. Based on how forecasts distribute across clusters, this approach also provides information on the likelihood of each such representative trajectory. If used operationally, this information has the potential to aid forecasters in parsing and quantifying the uncertainty in tropical cyclone track forecasts.

    S1 File -

    No full text
    Introduction: Vaccine hesitancy during the COVID-19 pandemic impacted many higher education institutions. Understanding the factors associated with vaccine hesitancy and uptake is instrumental in directing policies and disseminating reliable information during public health emergencies. Objective: This study evaluates associations of age, gender, and political leaning with COVID-19 vaccination status at a large, multi-campus, public university in Pennsylvania. Methods: From October 5 to November 30, 2021, a 10-minute REDCap survey was available to students, faculty, and staff 18 years of age and older at the Pennsylvania State University (PSU). Recruitment included targeted email, social media, digital advertisements, and university newspapers. 4,231 responses were received. Associations between the selected factors and vaccine hesitancy were assessed with Chi-square tests and generalized linear regression models using R version 4.3.1 (2023-06-16). Results: A logistic regression approach suggested that age and political leaning have a statistically significant association with vaccine hesitancy at the 5% level. Adjusted for political leaning, the odds of being vaccinated are about 4 times higher for those aged 56 years or older than for those aged 18 to 20 (OR = 4.35, 95% CI = (2.82, 6.85), p-value [...]). Conclusions: Age and political leaning are key predictors of vaccine uptake among members of the PSU community, knowledge of which may inform campus leadership’s public health efforts such as vaccine campaigns and policy decisions.

    History of a COVID-19-positive test and vaccine statuses for COVID-19 and influenza.

    No full text
    History of a COVID-19-positive test and vaccine statuses for COVID-19 and influenza.