6 research outputs found

    SPODT: An R Package to Perform Spatial Partitioning

    Spatial cluster detection is a classical question in epidemiology: are cases located near other cases? To classify a study area into zones of different risk and determine their boundaries, we have developed a spatial partitioning method based on oblique decision trees, called the spatial oblique decision tree (SpODT). This non-parametric method is based on the classification and regression tree (CART) approach introduced by Leo Breiman. Applied to epidemiological spatial data, the algorithm recursively searches among the coordinates for a threshold or boundary between zones such that the risks estimated in these zones are as different as possible. Whereas the CART algorithm yields rectangular zones, because it splits perpendicular to the longitude and latitude axes, the SpODT algorithm splits the study area obliquely, which is more appropriate and accurate for spatial epidemiology. Oblique decision trees can be considered non-parametric regression models. Beyond the basic function, we have developed a set of functions that enable extended analyses of spatial data, providing inference, graphical representations, spatio-temporal analysis, adjustment for covariates, spatially weighted partitioning, and the merging of similar adjacent final classes. In this paper, we propose a new R package, SPODT, which provides an extensible set of functions for partitioning spatial and spatio-temporal data. The implementation and extensions of the algorithm are described. Usage examples are given, looking for clusters of malaria episodes in Bandiagara, Mali, and in samples showing three different cluster shapes.
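
    The contrast between perpendicular (CART-style) and oblique splits can be illustrated with a minimal Python sketch. This is not the SPODT implementation: the function name `best_oblique_split`, the fixed grid of candidate angles, and the between-group sum of squares used as the criterion are all assumptions made for illustration. Angles 0 and pi/2 correspond to axis-parallel splits; any other angle gives an oblique boundary.

```python
import math

def best_oblique_split(x, y, z, n_angles=36, min_size=2):
    """Find one straight-line split of the plane that makes the mean of z
    as different as possible on the two sides (hypothetical helper)."""
    n = len(z)
    total = sum(z)
    best = {"score": -math.inf, "angle": None, "threshold": None}
    for step in range(n_angles):
        theta = math.pi * step / n_angles
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Project every location onto the direction theta, then sort.
        proj = sorted((xi * cos_t + yi * sin_t, zi)
                      for xi, yi, zi in zip(x, y, z))
        left_sum = sum(zi for _, zi in proj[:min_size - 1])
        for k in range(min_size, n - min_size + 1):
            left_sum += proj[k - 1][1]  # sum of z over the first k points
            if proj[k][0] - proj[k - 1][0] < 1e-12:
                continue  # no room for a boundary between tied projections
            mean_left = left_sum / k
            mean_right = (total - left_sum) / (n - k)
            # Between-group sum of squares for a two-group partition.
            score = k * (n - k) / n * (mean_left - mean_right) ** 2
            if score > best["score"]:
                best = {
                    "score": score,
                    "angle": theta,
                    "threshold": 0.5 * (proj[k - 1][0] + proj[k][0]),
                }
    return best

# Toy data: high values on one side of a diagonal boundary.
coords = [(0.25 * i, 0.25 * j) for i in range(5) for j in range(5)]
x = [p[0] for p in coords]
y = [p[1] for p in coords]
z = [1.0 if xi + yi > 1.1 else 0.0 for xi, yi in coords]
split = best_oblique_split(x, y, z)
```

    On this toy grid the selected direction is oblique, since no split perpendicular to an axis separates the two groups. A recursive partition would apply the same search again within each side of the chosen boundary.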

    Methods to analyze net survival: use of life tables, comparison test and spatial cluster detection

    In cancer research, net survival is a key indicator of the effectiveness of health care systems. This theoretical concept is the survival that would be observed in a hypothetical world where the cancer under study was the only possible cause of death. In population-based studies, where the cause of death is unknown, net survival removes the effect of death from causes other than cancer and therefore allows survival to be compared between populations. In this thesis, after presenting the concept and the estimation methods of net survival, we focus on three complementary issues. The first concerns the life tables used to estimate net survival. In France, these tables are stratified by age, sex, year and département. Other prognostic factors also affect mortality, so it would be useful to have life tables stratified by some of these factors. We study the impact of this lack of stratification on the estimated effects of prognostic factors on excess mortality (mortality due to cancer in the absence of other causes of death), using simulations and real data. The second issue is the construction of a log-rank type test to compare net survival distributions between several groups, estimated with the Pohar-Perme estimator, a consistent non-parametric estimator of net survival proposed in 2012. The third issue is to identify, within a geographical area, zones that differ in terms of net survival. We adapt a cluster detection method to net survival, using the test we developed as the splitting criterion. This work thus proposes new developments and tools to study and improve the quality of care for cancer patients. These methods are also suitable for other chronic diseases.

    Modeling time-varying exposure using inverse probability of treatment weights

    For estimating the causal effect of treatment exposure on the occurrence of adverse events, inverse probability weights (IPW) can be used in marginal structural models to correct for time-dependent confounding. The R package ipw allows IPW estimation by modeling the relationship between the exposure and confounders via several regression models, among which is the Cox model. For right-censored data and time-dependent exposures such as treatment switches, the ipw package allows a single switch, assuming that patients are treated once and for all. To accommodate multiple switches, we extend this package by implementing a function that allows for multiple and intermittent exposure statuses in the estimation of IPW using a survival model. This extension takes the whole treatment exposure trajectory into account in the estimation of IPW. The impact of the estimated weights on the estimated causal effect, with both methods, is assessed in a simulation study. The function is then illustrated on a real dataset from a nationwide prospective observational cohort of patients with inflammatory bowel disease. In this study, patients received one or multiple medications (thiopurines, methotrexate, and anti-TNF) over time. We used a Cox marginal structural model to assess the effect of thiopurine exposure on the cause-specific hazard for cancer incidence, considering other treatments as confounding factors. To this end, we used our extended function, which is available online in the Supporting Information.
    Keywords: causal inference, inverse probability weighting, marginal structural models, R package, treatment switc
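
    As an illustration of how weights accumulate over a whole exposure trajectory, here is a minimal Python sketch of stabilized inverse probability weights. It is not the ipw package's API nor the authors' extension: the function name `stabilized_weights` and the record keys `id`, `time`, `p_num` and `p_den` are hypothetical, and the probabilities of the observed exposure status are assumed to have been predicted beforehand by some fitted model (without confounders for `p_num`, with confounders for `p_den`).

```python
from itertools import groupby
from operator import itemgetter

def stabilized_weights(records):
    """Return {(id, time): weight}, where the weight at each time is the
    running product of p_num / p_den over that subject's intervals so far
    (hypothetical helper, inputs assumed to be pre-computed probabilities)."""
    weights = {}
    ordered = sorted(records, key=itemgetter("id", "time"))
    for subject, rows in groupby(ordered, key=itemgetter("id")):
        w = 1.0
        for row in rows:
            if not 0.0 < row["p_den"] <= 1.0:
                raise ValueError("p_den must lie in (0, 1]")
            # Each interval contributes one ratio, so switching exposure
            # back and forth is handled like any other trajectory.
            w *= row["p_num"] / row["p_den"]
            weights[(subject, row["time"])] = w
    return weights

# Toy example: subject 1 has two intervals, subject 2 has one.
example = [
    {"id": 1, "time": 2, "p_num": 0.25, "p_den": 0.5},
    {"id": 1, "time": 1, "p_num": 0.5, "p_den": 0.25},
    {"id": 2, "time": 1, "p_num": 0.5, "p_den": 0.5},
]
w = stabilized_weights(example)
```

    The resulting weights would then be supplied to a weighted outcome model, for example a weighted Cox model, to estimate the marginal structural model.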

    Correcting inaccurate background mortality in excess hazard models through breakpoints

    Background: Methods for estimating relative survival are widely used in population-based cancer survival studies. These methods split the observed (overall) mortality into excess mortality (due to cancer) and background mortality (due to other causes, as expected in the general population). The latter is derived from life tables that are usually stratified by age, sex, and calendar year but not by other covariates (such as deprivation level or socioeconomic status), which may be unavailable even though they influence background mortality. The absence of these covariates leads to inaccurate background mortality and thus to bias in estimating the excess mortality. This bias may be avoided by adjusting the background mortality for these covariates whenever they are available. Methods: In this work, we propose a regression model of excess mortality that corrects for potentially inaccurate background mortality by introducing age-dependent multiplicative parameters through breakpoints, which gives some flexibility. The performance of this model was first assessed with one and two breakpoints in an intensive simulation study; the method was then applied to French population-based data on colorectal cancer. Results: The proposed model performed well in the simulations and in the application to real data; it limited the bias in parameter estimates of the excess mortality in several scenarios and improved the results and the generalizability of Touraine's proportional hazards model. Conclusion: The proposed model is a good approach to reliably correct inaccurate background mortality by introducing multiplicative parameters that depend on age and on an additional variable through breakpoints.
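
    The decomposition described in this abstract can be written out as follows. The first line is the standard excess hazard decomposition; the second and third lines are only one possible reading of "age-dependent multiplicative parameters through breakpoints", and the symbols $\alpha_k$ and $b_k$ are notation assumed here, not taken from the paper.

```latex
% Observed hazard = excess hazard + background (life-table) hazard,
% for a patient aged a at diagnosis, t years after diagnosis,
% with covariates x and life-table strata z:
\lambda_O(t \mid x, z) = \lambda_E(t \mid x) + \lambda_P(a + t \mid z)

% Assumed form of the correction: the background hazard is rescaled by
% a factor that is piecewise constant in attained age u = a + t,
% changing value at breakpoints b_1 < \dots < b_K
% (with b_0 = 0 and b_{K+1} = \infty):
\lambda_O(t \mid x, z) = \lambda_E(t \mid x) + \alpha(a + t)\,\lambda_P(a + t \mid z)

\alpha(u) = \sum_{k=0}^{K} \alpha_k \,\mathbf{1}\{\, b_k \le u < b_{k+1} \,\}, \qquad \alpha_k > 0
```

    Under this reading, $\alpha_k = 1$ for all $k$ recovers the usual model in which the life table is taken to be accurate.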

    Correcting for heterogeneity and non-comparability bias in multicenter clinical trials with a rescaled random-effect excess hazard model

    In the presence of competing causes of event occurrence (e.g., death), the interest might lie not only in the overall survival but also in the so-called net survival, that is, the hypothetical survival that would be observed if the disease under study were the only possible cause of death. Net survival estimation is commonly based on the excess hazard approach, in which the hazard rate of an individual is assumed to be the sum of a disease-specific hazard rate and an expected hazard rate, the latter supposedly well approximated by the mortality rates obtained from general population life tables. However, this assumption might not be realistic if the study participants are not comparable with the general population. In addition, the hierarchical structure of the data can induce a correlation between the outcomes of individuals from the same cluster (e.g., hospital, registry). We propose an excess hazard model that corrects for these two sources of bias simultaneously, instead of dealing with them independently as before. We assessed the performance of this new model and compared it with three similar models, using an extensive simulation study as well as an application to breast cancer data from a multicenter clinical trial. The new model performed better than the others in terms of bias, root mean square error, and empirical coverage rate. The proposed approach might be useful to account simultaneously for the hierarchical structure of the data and the non-comparability bias in studies such as long-term multicenter clinical trials, when there is interest in the estimation of net survival.