
    Missing at random, likelihood ignorability and model completeness

    This paper provides further insight into the key concept of missing at random (MAR) in incomplete data analysis. Following the usual selection modelling approach, we envisage two models with separable parameters: a model for the response of interest and a model for the missing data mechanism (MDM). If the response model is given by a complete density family, then frequentist inference from the likelihood function ignoring the MDM is valid if and only if the MDM is MAR. This necessary and sufficient condition also holds more generally for models for coarse data, such as censoring. Examples are given to show that completeness of the underlying model is necessary for this equivalence to hold.

    The radial plot in meta-analysis: approximations and applications

    Fixed effects meta-analysis can be thought of as least squares analysis of the radial plot, the plot of standardized treatment effect against precision (reciprocal of the standard deviation) for the studies in a systematic review. For example, the least squares slope through the origin estimates the treatment effect, and a widely used test for publication bias is equivalent to testing the significance of the regression intercept. However, the usual theory assumes that the within-study variances are known, whereas in practice they are estimated. This leads to extra variability in the points of the radial plot, which can markedly distort inferences derived from these regression calculations. This is illustrated by a clinical trials example from the Cochrane database. We derive approximations to the sampling properties of the radial plot and suggest bias corrections to some of the commonly used methods of meta-analysis. A simulation study suggests that these bias corrections are effective in controlling levels of significance of tests and coverage of confidence intervals.
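    The radial plot construction described above can be sketched numerically. The study data below are invented for illustration; the identity checked is that the least squares slope through the origin equals the usual inverse-variance weighted (fixed-effects) estimate, and the intercept of an unconstrained fit gives the Egger-style bias test statistic.

```python
import numpy as np

# Hypothetical study data: effect estimates y_i with within-study
# standard deviations s_i (assumed known, as in the classical theory).
y = np.array([0.30, 0.15, 0.25, 0.40, 0.10])
s = np.array([0.10, 0.20, 0.15, 0.30, 0.12])

# Radial plot coordinates: standardized effect against precision.
z = y / s          # standardized treatment effect
x = 1.0 / s        # precision

# Least squares slope through the origin = fixed-effects estimate,
# i.e. the inverse-variance weighted mean of the y_i.
slope = np.sum(x * z) / np.sum(x * x)
fe_est = np.sum(y / s**2) / np.sum(1.0 / s**2)
assert np.isclose(slope, fe_est)

# Publication-bias check: fit z = a + b*x without constraint and
# examine whether the intercept a differs from zero.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
intercept = coef[0]
print(f"fixed-effects estimate: {slope:.4f}, intercept: {intercept:.4f}")
```

Note that this sketch takes the s_i as known; the abstract's point is precisely that treating estimated variances as known distorts these regression-based inferences.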

    Confidence intervals and P-values for meta-analysis with publication bias

    We study publication bias in meta-analysis by supposing there is a population (y, σ) of studies which give treatment effect estimates y ~ N(θ, σ²). A selection function describes the probability that each study is selected for review. The overall estimate of θ depends on the studies selected, and hence on the (unknown) selection function. Our previous paper, Copas and Jackson (2004, A bound for publication bias based on the fraction of unpublished studies, Biometrics 60, 146-153), studied the maximum bias over all possible selection functions which satisfy the weak condition that large studies (small σ) are as likely, or more likely, to be selected than small studies (large σ). This led to a worst-case sensitivity analysis, controlling for the overall fraction of studies selected. However, no account was taken of the effect of selection on the uncertainty in estimation. This paper extends the previous work by finding corresponding confidence intervals and P-values, and hence a new sensitivity analysis for publication bias. Two examples are discussed.

    Studies in compound decisions


    Selection models with monotone weight functions in meta analysis

    Publication bias, the fact that studies identified for inclusion in a meta analysis do not represent all studies on the topic of interest, is commonly recognized as a threat to the validity of the results of a meta analysis. One way to explicitly model publication bias is via selection models or weighted probability distributions. We adopt the nonparametric approach initially introduced by Dear (1992) but impose that the weight function w is monotone non-increasing as a function of the p-value. Since in meta analysis one typically only has few studies or "observations", regularization of the estimation problem seems sensible. In addition, virtually all parametric weight functions proposed so far in the literature are in fact decreasing. We discuss how to estimate a decreasing weight function in the above model and illustrate the new methodology on two well-known examples. The new approach potentially offers more insight into the selection process than other methods and is more flexible than parametric approaches. Some basic properties of the log-likelihood function and computation of a p-value quantifying the evidence against the null hypothesis of a constant weight function are indicated. In addition, we provide an approximate selection bias adjusted profile likelihood confidence interval for the treatment effect. The corresponding software and the datasets used to illustrate it are provided as the R package selectMeta. This enables full reproducibility of the results in this paper.
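    The weighted-distribution idea can be made concrete with a small numerical sketch. This is not the selectMeta implementation; it simply evaluates the selection-model log-likelihood for an invented non-increasing step weight function w(p), where each study's observed density is w(p(y)) f(y; θ, σ) renormalized over y.

```python
import numpy as np
from scipy.stats import norm

def pval(y, sigma):
    """Two-sided p-value of an effect estimate y with standard error sigma."""
    return 2.0 * norm.sf(np.abs(y) / sigma)

def log_lik(theta, y, sigma, w):
    """Selection-model log-likelihood: w maps p-values to weights in (0, 1]."""
    grid = np.linspace(-10.0, 10.0, 4001)   # numerical integration grid
    dx = grid[1] - grid[0]
    ll = 0.0
    for yi, si in zip(y, sigma):
        dens = norm.pdf(grid, loc=theta, scale=si)
        norm_const = np.sum(w(pval(grid, si)) * dens) * dx
        ll += (np.log(w(pval(yi, si)))
               + norm.logpdf(yi, theta, si)
               - np.log(norm_const))
    return ll

# A non-increasing step weight: significant results always published,
# non-significant ones only 30% of the time (values chosen arbitrarily).
w_step = lambda p: np.where(p < 0.05, 1.0, 0.3)

y = np.array([0.8, 0.5, 1.2])       # invented study effect estimates
sigma = np.array([0.3, 0.25, 0.5])  # invented standard errors
print(log_lik(0.5, y, sigma, w_step))
```

The nonparametric approach of the paper estimates the step heights of w themselves, under the monotonicity constraint, rather than fixing them as here.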

    Men who have sex with men: a comparison of a probability sample survey and a community based study

    We compared characteristics of men who have sex with men (MSM) in a probability sample survey with a community based study in London. The majority of men in both surveys reported male sex partner(s) in the last year, but MSM recruited through the population based survey had lower levels of HIV risk behaviour, and reported fewer sexually transmitted infections and less HIV testing, than those recruited from gay venues. Community samples are likely to overestimate levels of risk behaviour among all MSM.

    The analysis of data where response or selection is dependent on the variable of interest

    In surveys of sensitive subjects, non-response may be dependent on the variable of interest, at both the unit and item levels. In some clinical and epidemiological studies, units are selected for entry on the basis of the outcome variable of interest. Both of these scenarios pose problems for statistical analysis, and standard techniques may be invalid or inefficient, except in some special cases. A new approach to the analysis of surveys of sensitive topics is developed, central to which is at least one variable which represents the enthusiasm to participate. This variable is included along with demographic variables in the calculation of a response propensity score. The score is derived as the fitted probabilities of item non-response to the question of interest. The distribution of the score for the unit non-responders is assumed equal to that of the item non-responders. Response is assumed independent of the variable of interest, conditional on the score. Weights based on the score can be used to derive unbiased estimates of the distribution of the variable of interest. The bootstrap is recommended for confidence interval construction. The technique is applied to data from the National Survey of Sexual Attitudes and Lifestyles. A simplification of the technique is developed that does not use the bootstrap, and which enables users to analyse the data without knowledge of the factors affecting non-response, using standard statistical software.

    To analyse the time from an initiating event to illness, a prospective study may be regarded as the optimal design. However, additional data from those already with the illness and still alive may also be available. A standard technique would be to ignore the additional data and left-truncate the times to illness at study entry. We develop a full likelihood approach and a weighted pseudo-likelihood approach, and compare these with the standard truncated-data approach. The techniques are used to fit simple models of time to illness based on data from a study of time to AIDS from HIV seroconversion.
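    The propensity-weighting idea in the first part of the abstract can be sketched on simulated data. Everything below is invented for illustration: response depends on an "enthusiasm" variable that is also correlated with the (sensitive) outcome, so the naive responder mean is biased, and weighting responders by the inverse of a fitted response propensity corrects it. The propensity model here is a plain logistic regression fitted by Newton-Raphson, not any specific survey package.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
enthusiasm = rng.normal(size=n)   # proxy for willingness to respond
age = rng.normal(size=n)          # demographic covariate

# Sensitive binary outcome, positively correlated with enthusiasm:
outcome = (rng.normal(size=n) + 0.8 * enthusiasm > 0).astype(float)

# Item response depends on enthusiasm, so responders are unrepresentative.
p_resp = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * enthusiasm)))
responded = rng.random(n) < p_resp

# Fit logistic model P(respond | enthusiasm, age) by Newton-Raphson.
X = np.column_stack([np.ones(n), enthusiasm, age])
beta = np.zeros(3)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    W = p * (1.0 - p)
    beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (responded - p))

score = 1.0 / (1.0 + np.exp(-(X @ beta)))   # estimated response propensity

# Inverse-propensity weighted estimate from responders only:
wts = 1.0 / score[responded]
naive = outcome[responded].mean()
weighted = np.average(outcome[responded], weights=wts)
print(f"naive: {naive:.3f}, weighted: {weighted:.3f}, "
      f"true: {outcome.mean():.3f}")
```

In the approach described above the score is fitted to item non-response and then assumed to transfer to unit non-responders; the simulation collapses that distinction into a single response indicator for brevity.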

    Optimal design of cluster randomised trials with continuous recruitment and prospective baseline period

    BACKGROUND: Cluster randomised trials, like individually randomised trials, may benefit from a baseline period of data collection. We consider trials in which clusters prospectively recruit or identify participants as a continuous process over a given calendar period, and ask whether, and for how long, investigators should collect baseline data as part of the trial in order to maximise precision. METHODS: We show how to calculate and plot the variance of the treatment effect estimator for different lengths of baseline period in a range of scenarios, and offer general advice. RESULTS: In some circumstances it is optimal not to include a baseline, while in others there is an optimal duration for the baseline. All other things being equal, the circumstances where it is preferable not to include a baseline period are those with a smaller recruitment rate, smaller intracluster correlation, greater decay in the intracluster correlation over time, or wider transition period between recruitment under control and intervention conditions. CONCLUSION: The variance of the treatment effect estimator can be calculated numerically and plotted against the duration of baseline to inform design. It would be of interest to extend these investigations to cluster randomised trial designs with more than two randomised sequences of control and intervention conditions, including stepped wedge designs.