STATISTICAL METHODS TO ANALYZE CONTINUOUS RISK VARIABLES IN INDIVIDUAL PATIENT DATA META-ANALYSES: APPLICATION ON A STUDY ON TOBACCO SMOKING AND GASTRIC CANCER RISK IN A CONSORTIUM OF CASE-CONTROL STUDIES (THE STOMACH POOLING (STOP) PROJECT)

Abstract

Gastric cancer is the fifth most common cancer and the third leading cause of cancer death worldwide in both sexes combined, with almost 1 million cases and over 700 000 deaths estimated in 2012. The presence of Helicobacter pylori is a key determinant of gastric cancer. However, other factors, including familial, genetic, environmental and social characteristics, also appear to play a role in the etiology of this disease. Tobacco smoking has been associated with an increased risk of morbidity and mortality from many diseases, including gastric cancer. Various epidemiologic consortia have been established for several cancers, but not yet for gastric cancer. A pooled analysis of case-control studies from around the world may allow an in-depth investigation of gastric cancer etiology. In particular, such a large dataset will allow us to better investigate lifestyle characteristics, including tobacco smoking, in relation to gastric cancer. The Stomach cancer Pooling (StoP) Project is an international epidemiological consortium. The inclusion criteria for study participation are a case-control study design (including nested case-control analyses derived from cohort studies) and the inclusion of at least 80 cases of gastric cancer (including both cardia and non-cardia locations). The aim of my project is to conduct a pooled analysis of data from already available international studies on the role of tobacco smoking in the etiology of gastric cancer, in particular the number of cigarettes smoked per day and the duration of smoking, using adequate statistical approaches. During the first year of the PhD program, my project focused on two-stage analysis. This method is commonly used in meta-analysis and is also applicable to pooled case-control analyses. The first step consists of calculating adjusted study-specific odds ratios (ORs) in order to account for differences across studies in design or population.
The second step consists of summarizing these study-specific risk estimates using meta-analytic methods that take into account the heterogeneity across studies. During the second year of my PhD program, I studied various statistical methods for analyzing continuous variables whose relationship with risk may be non-linear. In addition to categorizing continuous variables, I considered more flexible approaches, including fractional polynomials. During the third year, I focused my research on adapting these methods to the analysis of pooled case-control studies. In particular, I chose to use fractional polynomials in a two-stage method because they are simple to interpret and because their estimates can easily be pooled through a two-stage analysis. The first stage is to fit a fractional polynomial model in each study. The second stage is then performed for each value of the power term (or each pair of power terms for second-order fractional polynomials): the pooled dose-response relationship is estimated with a bivariate random-effects model, whose trend components can be estimated using restricted maximum likelihood (REML) or maximum likelihood (ML). The best model, denoted by the optimal power combination, is defined as the one minimizing the deviance or the Akaike Information Criterion (AIC), a penalized likelihood criterion that takes into account the number of parameters. We analyzed data from 21 studies including 10,040 cases and 25,602 controls. To investigate the relationship between tobacco smoking and gastric cancer risk, we first used a classical method, building categories of smokers 1) in terms of quantity: “never smokers”, “20 cigarettes per day” and 2) in terms of smoking duration: “never smokers”, “30 years of smoking”. We analyzed these variables with a two-stage method.
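
The fractional polynomial terms and the AIC-based choice of powers described here can be illustrated with a minimal Python sketch. It is not the thesis code. The candidate power set is the conventional one from the fractional polynomial literature and is an assumption, since the abstract does not list the powers used; the helper names (fp_term, fp2_terms, fp2_power_pairs, aic) are hypothetical. Exposures must be strictly positive, so zero values (never smokers) would need separate handling, which the abstract does not describe.

```python
import itertools
import numpy as np

# Conventional candidate powers for fractional polynomials.
# Assumption: the abstract does not state which powers were considered.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Return x**p, where p == 0 denotes log(x) by convention. Requires x > 0."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0 else x ** p


def fp2_terms(x, p1, p2):
    """Return the two columns of a second-order fractional polynomial.

    When the two powers are equal, the second column is the first one
    multiplied by log(x), following the usual definition.
    """
    x = np.asarray(x, dtype=float)
    first = fp_term(x, p1)
    second = first * np.log(x) if p1 == p2 else fp_term(x, p2)
    return np.column_stack([first, second])


def fp2_power_pairs():
    """All unordered pairs of candidate powers, including repeated powers."""
    return list(itertools.combinations_with_replacement(FP_POWERS, 2))


def aic(deviance, n_params):
    """AIC written as deviance (-2 log-likelihood) plus twice the parameter count."""
    return deviance + 2 * n_params
```

In a two-stage setting, one would loop over fp2_power_pairs(), fit the first-stage model in each study and the second-stage pooled model for each pair, and keep the pair with the smallest deviance or AIC.
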
The risk of gastric cancer increased significantly with the number of cigarettes per day, reaching an OR of 1.29 (95% CI 1.06-1.57) for smokers of more than 20 cigarettes per day, and with duration, reaching an OR of 1.32 (95% CI 1.17-1.49) for those who had smoked for more than 30 years, compared to never smokers. Results from our analysis confirm that there is an association between cigarette smoking and gastric cancer risk. This risk increases with the number of cigarettes and the duration of smoking. These increases in risk are confirmed by different statistical models, including linear models and fractional polynomials, that treat the number of cigarettes per day and the duration as continuous variables. To our knowledge, this is the first study using fractional polynomials in a two-stage random-effects method for pooled case-control studies. With this method, we were able to take into account study-specific adjustment variables and, thanks to mixed-effects modeling, heterogeneity across studies. Categorization has the advantage of simple epidemiologic interpretation and presentation of results. However, it assumes that the relationship between the risk of disease and the exposure is flat within intervals and that the response is discontinuous when a category cutpoint is crossed, which is unlikely to be realistic. Treating exposure variables as continuous may avoid these limitations. The relationship between cigarette smoking and gastric cancer risk can be discerned from the categorical analysis, but analyzing the variables as continuous through fractional polynomials provided additional information, in particular for understanding possible thresholds and changes in slope.
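
The second stage of the two-stage approach summarized in the abstract, pooling study-specific log odds ratios while allowing for between-study heterogeneity, can be sketched as follows. This is a simplified illustration and not the thesis method: it pools a single estimate per study and uses the DerSimonian-Laird moment estimator of the between-study variance, whereas the abstract describes a bivariate random-effects model fitted by REML or ML. The function names are hypothetical.

```python
import math


def pool_random_effects(log_ors, std_errs):
    """Pool study-specific log odds ratios with a random-effects model.

    Uses the DerSimonian-Laird estimator of the between-study variance
    (tau2), a simpler stand-in for the REML/ML estimation named in the
    abstract. Returns (pooled log OR, its standard error, tau2).
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum_w

    # Cochran's Q measures the dispersion of study estimates around the
    # fixed-effect (inverse-variance) mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's sampling variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errs]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


def odds_ratio_with_ci(log_or, std_err, z=1.96):
    """Back-transform a log OR to the OR scale with a 95% Wald interval."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * std_err),
        math.exp(log_or + z * std_err),
    )
```

Here log_ors and std_errs would come from the first stage, for example from an adjusted logistic regression fitted separately in each study.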
