58 research outputs found
Evidence synthesis for prognosis and prediction: application, methodology and use of individual participant data
Prognosis research summarises, explains and predicts future outcomes in patients with a particular condition. This thesis investigates the application and development of evidence synthesis methods for prognosis research, with particular attention given to improving individualised predictions from prognostic models developed and/or validated using meta-analysis techniques.
A review of existing prognostic models for recurrence of venous thromboembolism highlighted several methodological and reporting issues. This motivated the development of a new model to address previous shortcomings, in particular by explicitly modelling and reporting the baseline hazard to enable individualised risk predictions over time. The new model was developed using individual participant data from several studies, using a novel internal-external cross-validation approach. This highlighted the potential for between-study heterogeneity in model performance, and motivated the investigation of recalibration methods to substantially improve consistency in model performance across populations.
Finally, a new multiple imputation method was developed to investigate the impact of missing threshold information in meta-analysis of prognostic test accuracy. Computer code was developed to implement the method, and applied examples indicated missing thresholds could have a potentially large impact on conclusions. A simulation study indicated that the new method generally improves on the current standard, in terms of bias, precision and coverage
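The internal-external cross-validation idea mentioned in the abstract above can be sketched as a loop that holds out one study at a time, develops the model on the remaining studies, and checks performance in the omitted study. This is only a minimal illustration: the function names, the `fit` and `evaluate` callables, and the toy "model" (a pooled mean) are assumptions made for the example, not the thesis's actual model or code.

```python
from typing import Callable, Dict, List

def internal_external_cv(
    data_by_study: Dict[str, List[float]],
    fit: Callable[[List[float]], float],
    evaluate: Callable[[float, List[float]], float],
) -> Dict[str, float]:
    """Hold out each study in turn, fit on the rest, evaluate on the held-out study."""
    results = {}
    for held_out, validation in data_by_study.items():
        # Pool the records of every study except the one being held out.
        training = [
            value
            for study, values in data_by_study.items()
            if study != held_out
            for value in values
        ]
        model = fit(training)
        results[held_out] = evaluate(model, validation)
    return results

# Toy stand-ins (hypothetical): the "model" is the pooled mean outcome and the
# performance measure is observed mean minus predicted mean in the held-out study.
def fit_mean(outcomes: List[float]) -> float:
    return sum(outcomes) / len(outcomes)

def mean_difference(model: float, outcomes: List[float]) -> float:
    return sum(outcomes) / len(outcomes) - model

toy_data = {"A": [1.0, 2.0, 3.0], "B": [2.0, 4.0], "C": [10.0]}
performance = internal_external_cv(toy_data, fit_mean, mean_difference)
```

A wide spread of the per-study values in `performance` is the kind of between-study heterogeneity in model performance that the abstract refers to.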
Deriving percentage study weights in multi-parameter meta-analysis models: with application to meta-regression, network meta-analysis and one-stage individual participant data models.
Many meta-analysis models contain multiple parameters, for example due to multiple outcomes, multiple treatments or multiple regression coefficients. In particular, meta-regression models may contain multiple study-level covariates, and one-stage individual participant data meta-analysis models may contain multiple patient-level covariates and interactions. Here, we propose how to derive percentage study weights for such situations, in order to reveal the (otherwise hidden) contribution of each study toward the parameter estimates of interest. We assume that studies are independent, and utilise a decomposition of Fisher's information matrix to decompose the total variance matrix of parameter estimates into study-specific contributions, from which percentage weights are derived. This approach generalises how percentage weights are calculated in a traditional, single parameter meta-analysis model. Application is made to one- and two-stage individual participant data meta-analyses, meta-regression and network (multivariate) meta-analysis of multiple treatments. These reveal percentage study weights toward clinically important estimates, such as summary treatment effects and treatment-covariate interactions, and are especially useful when some studies are potential outliers or at high risk of bias. We also derive percentage study weights toward methodologically interesting measures, such as the magnitude of ecological bias (difference between within-study and across-study associations) and the amount of inconsistency (difference between direct and indirect evidence in a network meta-analysis)
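One way to read the decomposition described above: if study i contributes information matrix I_i and the total variance matrix is V = (sum of I_i)^-1, then V = sum over i of V I_i V, so the j-th diagonal element of V I_i V, divided by the j-th diagonal element of V, gives study i's share of the variance of parameter j. The sketch below implements that reading with NumPy. The function name and the toy meta-regression inputs are assumptions for illustration, and the exact formulation in the paper may differ in detail.

```python
import numpy as np

def percentage_study_weights(info_matrices):
    """Return a (studies x parameters) array of percentage weights.

    info_matrices: sequence of k square (p x p) information matrices,
    one per independent study.
    """
    info = np.asarray(info_matrices, dtype=float)      # shape (k, p, p)
    total_var = np.linalg.inv(info.sum(axis=0))        # V = (sum_i I_i)^-1
    contributions = total_var @ info @ total_var       # V I_i V for each study
    diag = np.diagonal(contributions, axis1=1, axis2=2)  # shape (k, p)
    return 100.0 * diag / np.diag(total_var)

# Toy common-effect meta-regression (hypothetical numbers): each study has an
# effect-estimate variance v_i and a study-level covariate z_i, so its
# information matrix for (intercept, slope) is x_i x_i^T / v_i with x_i = [1, z_i].
variances = np.array([0.04, 0.09, 0.01])
covariate = np.array([0.0, 1.0, 2.0])
design = np.column_stack([np.ones_like(covariate), covariate])
info_matrices = [np.outer(x, x) / v for x, v in zip(design, variances)]
weights = percentage_study_weights(info_matrices)
```

With a single parameter, the same function reduces to the familiar inverse-variance percentage weights, which is the sense in which the approach generalises the traditional calculation.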
Two-stage or not two-stage? That is the question for IPD meta-analysis projects
Individual participant data meta-analysis (IPDMA) projects obtain, check, harmonise and synthesise raw data from multiple studies. When undertaking the meta-analysis, researchers must decide between a two-stage or a one-stage approach. In a two-stage approach, the IPD are first analysed separately within each study to obtain aggregate data (e.g., treatment effect estimates and standard errors); then, in the second stage, these aggregate data are combined in a standard meta-analysis model (e.g., common-effect or random-effects). In a one-stage approach, the IPD from all studies are analysed in a single step using an appropriate model that accounts for clustering of participants within studies and, potentially, between-study heterogeneity (e.g., a general or generalised linear mixed model). The best approach to take is debated in the literature, and so here we provide clearer guidance for a broad audience. Both approaches are important tools for IPDMA researchers and neither is a panacea. If most studies in the IPDMA are small (few participants or events), a one-stage approach is recommended because it uses a more exact likelihood. However, in other situations, researchers can choose either approach, carefully following best practice. Some previous claims that a one-stage approach should always be used are misleading, and the two-stage approach will often suffice for most researchers. When differences do arise between the two approaches, they are often caused by researchers using different modelling assumptions or estimation methods, rather than by using one or two stages per se
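As a rough illustration of the second stage of a two-stage approach, the sketch below pools per-study estimates and standard errors under a common-effect model and under a random-effects model. The per-study inputs are assumed to come from a first stage that is not shown, and the DerSimonian-Laird estimator of between-study variance is used here only as one familiar choice; the abstract does not prescribe a particular estimator.

```python
import math
from typing import List, Tuple

def common_effect(estimates: List[float], std_errors: List[float]) -> Tuple[float, float]:
    """Inverse-variance weighted summary estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

def dersimonian_laird_tau2(estimates: List[float], std_errors: List[float]) -> float:
    """Method-of-moments estimate of between-study variance, truncated at zero."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled, _ = common_effect(estimates, std_errors)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - (len(estimates) - 1)) / c)

def random_effects(estimates: List[float], std_errors: List[float]) -> Tuple[float, float, float]:
    """Random-effects summary estimate, its standard error, and tau-squared."""
    tau2 = dersimonian_laird_tau2(estimates, std_errors)
    weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights)), tau2

# Hypothetical first-stage output: log hazard ratios and standard errors per study.
log_hr = [-0.30, -0.10, -0.45, 0.05]
se = [0.12, 0.20, 0.15, 0.25]
ce_estimate, ce_se = common_effect(log_hr, se)
re_estimate, re_se, tau2 = random_effects(log_hr, se)
```

When the estimated between-study variance is zero, the two summaries coincide; otherwise the random-effects standard error is at least as large as the common-effect one.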
Calculating the power of a planned individual participant data meta-analysis of randomised trials to examine a treatment-covariate interaction with a time-to-event outcome
Before embarking on an individual participant data meta-analysis (IPDMA) project, researchers should consider the power of their planned IPDMA conditional on the studies promising their IPD and their characteristics. Such power estimates help inform whether the IPDMA project is worth the time and funding investment, before IPD are collected. Here, we suggest how to estimate the power of a planned IPDMA of randomised trials aiming to examine treatment-covariate interactions at the participant-level (i.e., treatment effect modifiers). We focus on a time-to-event (survival) outcome with a binary or continuous covariate, and propose an approximate analytic power calculation that conditions on the actual characteristics of trials, for example, in terms of sample sizes and covariate distributions. The proposed method has five steps: (i) extracting the following aggregate data for each group in each trial: the number of participants and events, the mean and SD for each continuous covariate, and the proportion of participants in each category for each binary covariate; (ii) specifying a minimally important interaction size; (iii) deriving an approximate estimate of Fisher's information matrix for each trial and the corresponding variance of the interaction estimate per trial, based on assuming an exponential survival distribution; (iv) deriving the estimated variance of the summary interaction estimate from the planned IPDMA, under a common-effect assumption; and (v) calculating the power of the IPDMA based on a two-sided Wald test. Stata and R code are provided, along with a real example for illustration. Further evaluation in real examples and simulations is needed
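Steps (iv) and (v) above can be sketched in a few lines once the per-trial variances of the interaction estimate are available. In this sketch those variances are simply taken as inputs (step (iii), the Fisher information approximation under an exponential survival distribution, is not reproduced), and the function name and example numbers are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence

def ipdma_interaction_power(
    trial_variances: Sequence[float],
    interaction: float,
    alpha: float = 0.05,
) -> float:
    """Approximate power of a two-sided Wald test for the summary interaction.

    trial_variances: anticipated variance of the interaction estimate in each trial.
    interaction: assumed true (minimally important) interaction, e.g. a
                 difference in log hazard ratios.
    """
    # Step (iv): common-effect summary variance is the inverse of summed precisions.
    summary_var = 1.0 / sum(1.0 / v for v in trial_variances)
    z = interaction / math.sqrt(summary_var)
    # Step (v): probability that the Wald statistic falls in either rejection region.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(z - z_crit) + NormalDist().cdf(-z - z_crit)

# Hypothetical per-trial variances and a minimally important interaction size.
power = ipdma_interaction_power([0.09, 0.16, 0.05, 0.25], interaction=0.4)
```

Setting `interaction` to zero returns the significance level, which is a quick sanity check on the calculation.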
Calculating the power of a planned individual participant data meta-analysis to examine prognostic factor effects for a binary outcome
Collecting data for an individual participant data meta-analysis (IPDMA) project can be time consuming and resource intensive, and the project could still have insufficient power to answer the question of interest. Therefore, researchers should consider the power of their planned IPDMA before collecting IPD. Here we propose a method to estimate the power of a planned IPDMA project aiming to synthesise multiple cohort studies to investigate the (unadjusted or adjusted) effects of potential prognostic factors for a binary outcome. We consider both binary and continuous factors and provide a three-step approach to estimating the power in advance of collecting IPD, under an assumption of the true prognostic effect of each factor of interest. The first step uses routinely available (published) aggregate data for each study to approximate Fisher's information matrix and thereby estimate the anticipated variance of the unadjusted prognostic factor effect in each study. These variances are then used in step 2 to estimate the anticipated variance of the summary prognostic effect from the IPDMA. Finally, step 3 uses this variance to estimate the corresponding IPDMA power, based on a two-sided Wald test and the assumed true effect. Extensions are provided to adjust the power calculation for the presence of additional covariates correlated with the prognostic factor of interest (by using a variance inflation factor) and to allow for between-study heterogeneity in prognostic effects. An example is provided for illustration, and Stata code is supplied to enable researchers to implement the method
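The two extensions mentioned above can be illustrated with a small sketch. It assumes the usual variance inflation factor, 1 / (1 - R^2), where R^2 is the squared multiple correlation between the prognostic factor and the adjustment covariates, and it assumes heterogeneity enters by adding a between-study variance tau^2 to each study's inflated variance. Both assumptions, along with the function names and numbers, are illustrative choices rather than the paper's stated formulas.

```python
import math
from statistics import NormalDist
from typing import Sequence

def anticipated_summary_variance(
    unadjusted_variances: Sequence[float],
    r_squared: float = 0.0,
    tau2: float = 0.0,
) -> float:
    """Anticipated variance of the summary prognostic effect.

    r_squared: assumed squared multiple correlation between the factor of
               interest and the adjustment covariates (0 means unadjusted).
    tau2: assumed between-study variance of the prognostic effect.
    """
    vif = 1.0 / (1.0 - r_squared)  # standard variance inflation factor
    return 1.0 / sum(1.0 / (v * vif + tau2) for v in unadjusted_variances)

def wald_power(true_effect: float, variance: float, alpha: float = 0.05) -> float:
    """Power of a two-sided Wald test for an assumed true effect."""
    z = true_effect / math.sqrt(variance)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(z - z_crit) + NormalDist().cdf(-z - z_crit)

# Hypothetical per-study variances of an unadjusted log odds ratio.
study_variances = [0.06, 0.10, 0.04]
power_unadjusted = wald_power(0.3, anticipated_summary_variance(study_variances))
power_adjusted = wald_power(
    0.3, anticipated_summary_variance(study_variances, r_squared=0.2, tau2=0.01)
)
```

Under these assumptions, both adjustment for correlated covariates and heterogeneity increase the anticipated variance and therefore lower the estimated power.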
Minimum sample size for developing a multivariable prediction model using multinomial logistic regression
Aims
Multinomial logistic regression models allow one to predict the risk of a categorical outcome with more than two categories. When developing such a model, researchers should ensure the number of participants (n) is appropriate relative to the number of events (E_k) and the number of predictor parameters (p_k) for each category k. We propose three criteria to determine the minimum n required, in light of existing criteria developed for binary outcomes.
Proposed criteria
The first criterion aims to minimise model overfitting. The second aims to minimise the difference between the observed and adjusted Nagelkerke R2. The third criterion aims to ensure the overall risk is estimated precisely. For criterion (i), we show the sample size must be based on the anticipated Cox-Snell R2 of distinct "one-to-one" logistic regression models corresponding to the sub-models of the multinomial logistic regression, rather than on the overall Cox-Snell R2 of the multinomial logistic regression.
Evaluation of criteria
We tested the performance of the proposed criterion (i) through a simulation study and found that it resulted in the desired level of overfitting. Criteria (ii) and (iii) were natural extensions of previously proposed criteria for binary outcomes and did not require evaluation through simulation.
Summary
We illustrated how to implement the sample size criteria through a worked example considering the development of a multinomial risk prediction model for tumour type when presented with an ovarian mass. Code is provided for the simulation and worked example. We will embed our proposed criteria within the pmsampsize R library and Stata modules
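For orientation, the existing binary-outcome overfitting criterion that criterion (i) builds on can be written as n = p / ((S - 1) ln(1 - R2_CS / S)), where p is the number of predictor parameters, S is the targeted shrinkage factor (commonly 0.9) and R2_CS is the anticipated Cox-Snell R2. The sketch below applies that binary formula to a single "one-to-one" sub-model. How the sub-model sample sizes are then converted into an overall n for the multinomial model is not stated in the abstract and is not attempted here; the function name and inputs are illustrative.

```python
import math

def min_n_for_shrinkage(n_params: int, r2_cs: float, shrinkage: float = 0.9) -> int:
    """Minimum sample size targeting a given expected shrinkage factor
    for one binary (pairwise) logistic sub-model.

    n_params: number of predictor parameters in the sub-model.
    r2_cs: anticipated Cox-Snell R-squared of that sub-model.
    shrinkage: targeted uniform shrinkage factor S (0.9 means about 10% overfitting).
    """
    if not 0.0 < shrinkage < 1.0:
        raise ValueError("shrinkage must lie strictly between 0 and 1")
    if not 0.0 < r2_cs < shrinkage:
        raise ValueError("r2_cs must lie strictly between 0 and the shrinkage factor")
    n = n_params / ((shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))
    return math.ceil(n)

# Hypothetical sub-model: 24 predictor parameters, anticipated Cox-Snell R2 of 0.288.
n_required = min_n_for_shrinkage(24, 0.288)
```

A lower anticipated Cox-Snell R2 or more predictor parameters both push the required sample size up, which is why the choice of R2 (sub-model versus overall) matters for criterion (i).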
Individual participant data meta-analysis to examine linear or non-linear treatment-covariate interactions at multiple time-points for a continuous outcome
Individual participant data (IPD) meta-analysis projects obtain, harmonise, and synthesise original data from multiple studies. Many IPD meta-analyses of randomised trials are initiated to identify treatment effect modifiers at the individual level, thus requiring statistical modelling of interactions between treatment effect and participant-level covariates. Using a two-stage approach, the interaction is estimated in each trial separately and combined in a meta-analysis. In practice, two complications often arise with continuous outcomes: examining non-linear relationships for continuous covariates and dealing with multiple time-points. We propose a two-stage multivariate IPD meta-analysis approach that summarises non-linear treatment-covariate interaction functions at multiple time-points for continuous outcomes. A set-up phase is required to identify a small set of time-points; relevant knot positions for a spline function, at identical locations in each trial; and a common reference group for each covariate. Crucially, the multivariate approach can include participants or trials with missing outcomes at some time-points. In the first stage, restricted cubic spline functions are fitted and their interaction with each discrete time-point is estimated in each trial separately. In the second stage, the parameter estimates defining these multiple interaction functions are jointly synthesised in a multivariate random-effects meta-analysis model accounting for within-trial and across-trial correlation. These meta-analysis estimates define the summary non-linear interactions at each time-point, which can be displayed graphically alongside confidence intervals. The approach is illustrated using an IPD meta-analysis examining effect modifiers for exercise interventions in osteoarthritis, which shows evidence of non-linear relationships and small gains in precision by analysing all time-points jointly