11 research outputs found

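    Effet de l'échantillonnage non proportionnel de cas et de témoins sur une méthode de vraisemblance maximale pour l'estimation de la position d'une mutation sous sélection [Effect of non-proportional sampling of cases and controls on a maximum likelihood method for estimating the position of a mutation under selection]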

    Full text link
    Master's thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal

    DM-PhyClus: A Bayesian phylogenetic algorithm for infectious disease transmission cluster inference

    Full text link
    Background. Conventional phylogenetic clustering approaches rely on arbitrary cutpoints applied a posteriori to phylogenetic estimates. Although, in practice, Bayesian and bootstrap-based clustering tend to lead to similar estimates, they often produce conflicting measures of confidence in clusters. The current study proposes a new Bayesian phylogenetic clustering algorithm, which we refer to as DM-PhyClus, that identifies sets of sequences resulting from quick transmission chains, thus yielding easily interpretable clusters, without using any ad hoc distance or confidence requirement. Results. Simulations reveal that DM-PhyClus can outperform conventional clustering methods, as well as the Gap procedure, a pure distance-based algorithm, in terms of mean cluster recovery. We apply DM-PhyClus to a sample of real HIV-1 sequences, producing a set of clusters whose inference is in line with the conclusions of a previous thorough analysis. Conclusions. DM-PhyClus, by eliminating the need for cutpoints and producing sensible inference for cluster configurations, can facilitate transmission cluster detection. Future efforts to reduce incidence of infectious diseases, like HIV-1, will need reliable estimates of transmission clusters. It follows that algorithms like DM-PhyClus could serve to better inform public health strategies.

    Challenges in the prediction of motor vehicle traffic collisions with GPS travel data

    No full text
    In the field of road safety, crashes involving physical injuries typically occur on roadways, which constrain the events to lie along a linear network. Substantial research efforts have been devoted to the development of methods for point patterns on linear networks. In one such model, we assume that crash coordinates are produced by a Poisson point process whose domain corresponds to edges in the road network. This talk focuses on the analysis of geo-localised accident data in the context of a smart city initiative launched by the City of Quebec (Canada) aiming to identify crash hotspots on the road network based on covariates derived from GPS data. Data originate from three sources: i) a geolocalised traffic accident database whose entries are based on police reports, ii) GPS trajectories obtained from a study on 4,000 drivers involving 55,000 trips and iii) the structure of the road network obtained from the OpenStreetMap (OSM) database. We highlight challenges, both methodological and computational, with the use of those three data sources in producing sensible inference for the covariate effects. Non UBC. Unreviewed. Author affiliation: HEC Montréal. Postdoctora
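The Poisson point process model mentioned in this abstract can be illustrated with a minimal sketch. It is a simplification, not the authors' method: it assumes a constant intensity `rate` per unit length (the talk concerns covariate-dependent intensities), straight-line edges, and a hypothetical function name `simulate_crashes`.

```python
import random


def simulate_crashes(edges, rate, seed=0):
    """Simulate a homogeneous Poisson process on a network of straight edges.

    edges: list of ((x1, y1), (x2, y2)) segments (hypothetical road edges).
    rate:  expected number of events per unit length (assumed constant, > 0).
    """
    rng = random.Random(seed)
    points = []
    for (x1, y1), (x2, y2) in edges:
        length = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        # Exponential gaps along the edge give a Poisson(rate * length)
        # number of events, placed uniformly along the edge.
        position = rng.expovariate(rate)
        while position < length:
            t = position / length
            points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            position += rng.expovariate(rate)
    return points


# Two illustrative edges sharing a node; coordinates are made up.
edges = [((0.0, 0.0), (3.0, 4.0)), ((3.0, 4.0), (3.0, 10.0))]
crashes = simulate_crashes(edges, rate=2.0, seed=1)
```

Every simulated point lies on an edge by construction, which is the constraint the abstract describes for crash locations.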

    DM-PhyClus: A Bayesian phylogenetic algorithm for infectious disease transmission cluster inference

    No full text
    The files include the simulated datasets and the chain results for the simulation study described in "DM-PhyClus: A Bayesian phylogenetic algorithm for infectious disease transmission cluster inference". They are all compressed R data files (gzip format).

    An investigation of disease transmission clusters with Bayesian phylogenetic clustering methods

    No full text
    A number of studies have investigated transmission clusters in the Human Immunodeficiency Virus (HIV) epidemic among Men who have Sex with Men (MSM) in the province of Québec, Canada, stressing the contribution of clusters to incidence. Studies of that type usually rely on a sample of HIV-1 genetic sequences, whose ancestry is inferred with a phylogenetic model, yielding a tree used to partition the sample. Understanding of clusters found through phylogenetic analyses is still limited, which is reflected in the many ad hoc criteria used in their estimation. This manuscript-based thesis aims to improve understanding of phylogenetic clusters, propose improvements to transmission cluster inference methods, and provide updated estimates of HIV-1 transmission clusters in Québec, by means of a thorough comparison between results from conventional approaches and the new method. The first manuscript in the thesis addresses the issue of phylogenetic cluster interpretation. Through simulations of epidemics on several categories of contact networks, we explore the association between phylogenetic clusters, found under a variety of distance-based clustering methods, and communities, distinctive groups of densely connected individuals in the network. We find limited overlap between clusters and communities, suggesting that a network interpretation of phylogenetic clusters may not be warranted. The second manuscript presents a new phylogenetic clustering algorithm, DM-PhyClus, that builds a clear definition of transmission clusters into cluster inference, resulting in a straightforward interpretation for the inferred clusters. Unlike conventional phylogenetic clustering approaches, the method does not rely on arbitrary genetic distance or clade-confidence cutpoints applied a posteriori to the estimated phylogeny. Simulations reveal that DM-PhyClus can outperform a number of conventional clustering methods in terms of mean cluster recovery. We apply DM-PhyClus to a sample of real HIV-1 sequences obtained from the Québec HIV genotyping program database, revealing a set of clusters whose estimates are in line with the conclusions of a previous curated analysis. The third manuscript includes a detailed clustering analysis of HIV-1 cases among MSMs based on DNA sequences collected for the Québec HIV genotyping program. We first cluster the data with two conventional approaches, maximum likelihood phylogenetic inference coupled with bootstrap estimation of confidence in clades, and pure Bayesian phylogenetic estimation, under a variety of clustering criteria and cutpoints. We then partition the sample with the help of DM-PhyClus and the Gap Procedure, both approaches aiming to avoid arbitrary selection of cutpoints. The analyses based on conventional methods reveal largely overlapping sets of clusters, while DM-PhyClus and the Gap Procedure propose moderately different partitions. An examination of more recently diagnosed cases that are known to have been infected at most six months prior to diagnosis shows considerable expansion of large clusters, and hints at the emergence of a few new transmission clusters. The analyses stress the continued importance of clustering in maintaining the HIV epidemic among MSMs, and suggest that the frequency of early transmission events might explain why improvements in antiretroviral therapy have not led to the end of the epidemic.

    Assessing the role of transmission chains in the spread of HIV-1 among men who have sex with men in Quebec, Canada.

    No full text
    BACKGROUND: Phylogenetics has been used to investigate HIV transmission among men who have sex with men. This study compares several methodologies to elucidate the role of transmission chains in the dynamics of HIV spread in Quebec, Canada. METHODS: The Quebec Human Immunodeficiency Virus (HIV) genotyping program database now includes viral sequences from close to 4,000 HIV-positive individuals classified as Men who have Sex with Men (MSMs), collected between 1996 and early 2016. Assessment of chain expansion may depend on the partitioning scheme used, and so we produce estimates from several methods: the conventional Bayesian and maximum likelihood-bootstrap methods, in combination with a variety of schemes for applying a maximum distance criterion, and two other algorithms, DM-PhyClus, a Bayesian algorithm that produces a measure of uncertainty for proposed partitions, and the Gap Procedure, a fast non-phylogenetic approach. Sequences obtained from individuals in the Primary HIV Infection (PHI) stage serve to identify incident cases. We focus on the period ranging from January 1st 2012 to February 1st 2016. RESULTS AND CONCLUSION: The analyses reveal considerable overlap between chain estimates obtained from conventional methods, thus leading to similar estimates of recent temporal expansion. The Gap Procedure and DM-PhyClus, however, suggest moderately different chains. Nevertheless, all estimates stress that longer, older chains are responsible for a sizeable proportion of the sampled incident cases among MSMs. Curbing the HIV epidemic will require strategies aimed specifically at preventing such growth.

    Modeling Fetal Weight for Gestational Age: A Comparison of a Flexible Multi-level Spline-based Model with Other Approaches

    No full text
    We present a model for longitudinal measures of fetal weight as a function of gestational age. We use a linear mixed model, with a Box-Cox transformation of fetal weight values, and restricted cubic splines, in order to flexibly but parsimoniously model median fetal weight. We systematically compare our model to other proposed approaches. All proposed methods are shown to yield similar median estimates, as evidenced by overlapping pointwise confidence bands, except after 40 completed weeks, where our method seems to produce estimates more consistent with observed data. Sex-based stratification affects the estimates of the random effects variance-covariance structure, without significantly changing sex-specific fitted median values. We illustrate the benefits of including sex-gestational age interaction terms in the model over stratification. The comparison leads to the conclusion that the selection of a model for fetal weight for gestational age can be based on the specific goals and configuration of a given study without affecting the precision or value of median estimates for most gestational ages of interest.
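The Box-Cox transformation named in this abstract can be sketched as follows. This is only an illustration of the standard transform and its inverse, not the authors' fitting code; the function names and the example value of `lam` are assumptions, and the mixed model and spline terms are not reproduced here.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a value on the transformed scale back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Illustrative fetal weight in grams and an assumed lam of 0.5.
z = box_cox(1500.0, 0.5)
weight = inverse_box_cox(z, 0.5)
```

Because the transform is monotone, back-transforming a fitted centre on the transformed scale gives an estimate of the median, not the mean, on the original scale, provided the transformed values are roughly symmetric. This is consistent with the abstract's focus on median fetal weight.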