7,831 research outputs found

    Generalized Linear Mixed Models for Randomized Responses

    Response bias (nonresponse and social desirability bias) is one of the main concerns when asking sensitive questions about behavior and attitudes. Self-reports on sensitive issues, as in health research (e.g., drug and alcohol abuse) and the social and behavioral sciences (e.g., attitudes against refugees, academic cheating), can be expected to be subject to considerable misreporting. To reduce misreporting in self-reports, indirect questioning techniques such as the randomized response techniques have been proposed. The randomized response techniques avoid a direct link between an individual's response and the sensitive question, thereby protecting the individual's privacy. Alongside the development of these innovative data collection methods, methodological advances have been made to enable multivariate analyses that relate responses to sensitive questions to other variables. It is shown that these developments can be represented by a general response probability model (covering all common designs), which is extended to a generalized linear model (GLM) or a generalized linear mixed model (GLMM). The general methodology is based on modifying common link functions to relate a linear predictor to the randomized response. This approach makes it possible to use existing software for GLMs and GLMMs to model randomized response data. The R-package GLMMRR makes the advanced methodology available to applied researchers. The extended models and software will seriously improve the application of the randomized response methodology. Three empirical examples are given to illustrate the methods.
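
The idea of a modified link function can be illustrated with a minimal sketch. It assumes the commonly used linear form of the response probability model, P(observed "yes") = c + d * pi, where pi is the latent prevalence and c and d are known design constants. This form, the logistic choice for pi, and the function names are assumptions made for illustration; they are not taken from the abstract and do not reproduce the GLMMRR API.

```python
import math


def rr_inverse_link(eta, c, d):
    """Map a linear predictor eta to the probability of an observed 'yes'.

    Assumed model: pi = logistic(eta) is the latent prevalence, and the
    randomizing device turns it into c + d * pi.
    """
    pi = 1.0 / (1.0 + math.exp(-eta))
    return c + d * pi


def rr_link(mu, c, d):
    """Map an observed-response probability mu back to the linear predictor.

    Only defined when (mu - c) / d lies strictly between 0 and 1.
    """
    pi = (mu - c) / d
    if not 0.0 < pi < 1.0:
        raise ValueError("mu is outside the range implied by c and d")
    return math.log(pi / (1.0 - pi))


# Hypothetical forced-response design: answer truthfully with probability
# 0.75, otherwise give a forced answer that is 'yes' half of the time.
c = 0.25 * 0.5   # probability of a forced 'yes'
d = 0.75         # probability of a truthful answer

mu = rr_inverse_link(0.0, c, d)   # eta = 0 means latent prevalence 0.5
eta_back = rr_link(mu, c, d)      # recovers the linear predictor
```

Setting c = 0 and d = 1 reduces both functions to the ordinary logit link, which is why standard GLM fitting machinery can in principle be reused once the link is swapped.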

    Analysis of neonatal clinical trials with twin births

    Background: In neonatal trials of pre-term or low-birth-weight infants, twins may represent 10–20% of the study sample. Mixed-effects models and generalized estimating equations are common approaches for handling correlated continuous or binary data. However, the operating characteristics of these methods for mixes of correlated and independent data are not well established. Methods: Simulation studies were conducted to compare mixed-effects models and generalized estimating equations to linear regression for continuous outcomes. Similarly, mixed-effects models and generalized estimating equations were compared to ordinary logistic regression for binary outcomes. The parameter of interest is the treatment effect in two-armed clinical trials. Data from the National Institute of Child Health & Human Development Neonatal Research Network are used for illustration. Results: For continuous outcomes, while the coverage never fell below 0.93, and the type I error rate never exceeded 0.07 for any method, overall linear mixed-effects models performed well with respect to median bias, mean squared error, coverage, and median width. For binary outcomes, the coverage never fell below 0.90, and the type I error rate never exceeded 0.07 for any method. In these analyses, when randomization of twins was to the same treatment group or done independently, ordinary logistic regression performed best. When randomization of twins was to opposite treatment arms, a rare method of randomization in this setting, ordinary logistic regression still performed adequately. Overall, generalized linear mixed models showed the poorest coverage values. Conclusion: For continuous outcomes, using linear mixed-effects models for analysis is preferred. For binary outcomes, in this setting where the amount of related data is small, but non-negligible, ordinary logistic regression is recommended.

    Asking Sensitive Questions Using the Crosswise Model: An Experimental Survey Measuring Plagiarism

    Yu, Tian, and Tang (2008) proposed two new techniques for asking questions on sensitive topics in population surveys: the triangular model (TM) and the crosswise model (CM). The two models can be used as alternatives to the well-known randomized response technique (RRT) and are meant to overcome some of the drawbacks of the RRT. Although Yu, Tian, and Tang provide a promising theoretical analysis of the proposed models, they did not test them. We therefore provide results from an experimental survey in which the crosswise model was implemented and compared to direct questioning. To our knowledge, this is the first empirical evaluation of the crosswise model. We focused on the crosswise model because it seems better suited than the triangular model to overcome the self-protective "no" bias observed for the RRT. This paper-and-pencil survey on plagiarism was administered to Swiss and German students in university classrooms. Results suggest that the CM is a promising data-collection instrument eliciting more socially undesirable answers than direct questioning.
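
For readers unfamiliar with how a crosswise-model estimate is obtained, here is a minimal sketch. It assumes the usual setup, which the abstract itself does not spell out: each respondent reports only whether their answers to the sensitive question and to a non-sensitive question with known "yes" probability p (p not equal to 0.5) are the same or different, so that P(same) = pi * p + (1 - pi) * (1 - p). The function name and the example counts are hypothetical and are not data from the study.

```python
import math


def crosswise_estimate(n_same, n_total, p):
    """Estimate sensitive-trait prevalence from crosswise-model counts.

    n_same  : respondents reporting that both answers are the same
    n_total : total number of respondents
    p       : known 'yes' probability of the non-sensitive question
    Returns (prevalence estimate, standard error).
    """
    if abs(p - 0.5) < 1e-12:
        raise ValueError("p must differ from 0.5 for the model to be identified")
    lam = n_same / n_total                    # observed share of 'same' answers
    pi_hat = (lam + p - 1.0) / (2.0 * p - 1.0)
    # Binomial variance of lam, scaled by the design factor (2p - 1)^2.
    var = lam * (1.0 - lam) / (n_total * (2.0 * p - 1.0) ** 2)
    return pi_hat, math.sqrt(var)


# Hypothetical counts: 700 of 1000 respondents report 'same', with p = 0.25.
pi_hat, se = crosswise_estimate(700, 1000, 0.25)
```

The closer p is to 0.5, the stronger the privacy protection but the larger the standard error, since the design factor (2p - 1)^2 in the denominator shrinks toward zero.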

    Decision trees in epidemiological research

    Background: In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. Main text: We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups that share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression Tree (CART) technique and the newer Conditional Inference Tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Conclusions: Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.