    TO WEIGHT OR TO ADJUST: AN EMPIRICAL STUDY OF THE DESIGN-BASED AND MODEL-BASED APPROACHES

    When a sampling design is correlated with the dependent variable, the distribution of the sampled units differs from that obtained under simple random sampling. The sampling design is then informative, in the sense that if the design variables are not included in the analysis model, the estimated model parameters can be biased even conditional on the covariates. This raises the question of how survey data should be modeled when the sampling design is informative. Two fundamental methodologies, design-based and model-based, have been proposed to address this issue. One model-based method, the so-called sample distribution method, was proposed by Krieger and Pfeffermann (1992; 1997) to derive the model of the sample data as a function of the model holding in the population and the sampling design. Once the model holding in the sample data is derived, standard model-based analysis techniques can be applied to estimate the unknown population parameters. The core aim of this dissertation is to assess various modeling strategies and estimators of regression coefficients and their variances, both design-based and model-based and in particular the sample distribution method, under informative sampling designs, and to develop a modeling strategy for analysts facing the choice between design-based and model-based analysis. The dissertation comprises three research papers that provide 1) an evaluation of the design-based and model-based estimators under a single-stage informative sampling design; 2) an assessment of design-based and model-based estimators under an informative two-stage cluster sampling design; 3) a joint treatment of informative sampling and unit dropout in longitudinal studies. When a single-stage sampling design is informative, the naïve model-based method (either ordinary least squares or maximum likelihood) produces biased results. The design-based method reduces the bias for some parameters (e.g. the intercept) but increases variances, which may lead to overly conservative conclusions. The sample distribution method produces better estimates, in terms of smaller biases and variances, than the naïve and design-based methods. Under an informative two-stage cluster sampling design, the naïve model-based method, which ignores the sampling effect, produces biased results. Under some specific assumptions, the sample distribution method produces better estimators, in terms of smaller biases and higher coverage rates, than the naïve method and the design-based multilevel pseudo-likelihood method. Although many previous studies have shown that the multilevel pseudo-likelihood method is preferred for compensating for the sampling design, this study shows that a simpler method, the sample distribution method, can be used to address the design effect. In a specific statistical setting, the relative performance of the design-based and model-based methods in compensating for informative sampling and dropout is investigated. The simulation results indicate that both the model-based and the design-based approaches generally work well in the missing-at-random and missing-not-at-random settings. Moreover, the sample distribution method combined with the Diggle and Kenward model has the advantage of correcting for both the design effect and nonignorable dropout. Doctor of Philosoph
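The contrast between the naïve and design-based estimators under a single-stage informative design can be illustrated with a small simulation. The sketch below is not the dissertation's simulation study: the population model (y = 1 + 2x + e), the inclusion probabilities proportional to exp(0.5y), the Poisson sampling scheme, and the names `fit_line` and `simulate` are all assumptions chosen for illustration. It compares unweighted least squares with least squares weighted by the inverse inclusion probabilities.

```python
import math
import random


def fit_line(xs, ys, ws):
    """Weighted least squares for y = b0 + b1*x; returns (b0, b1)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def simulate(seed=0, pop_size=50_000):
    """Draw one informative Poisson sample and fit both estimators.

    Assumed population model (illustrative only): y = 1 + 2x + e, e ~ N(0, 1).
    Assumed design: unit i is included with probability pi_i = 0.01 * exp(0.5 * y_i),
    so selection depends on the outcome itself (informative sampling).
    """
    rng = random.Random(seed)
    xs, ys, ws = [], [], []
    for _ in range(pop_size):
        x = rng.random()
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        pi = min(1.0, 0.01 * math.exp(0.5 * y))
        if rng.random() < pi:
            xs.append(x)
            ys.append(y)
            ws.append(1.0 / pi)  # design weight = inverse inclusion probability
    naive = fit_line(xs, ys, [1.0] * len(xs))  # ignores the design
    weighted = fit_line(xs, ys, ws)            # design-weighted fit
    return {"n": len(xs), "naive": naive, "weighted": weighted}


if __name__ == "__main__":
    result = simulate()
    print("sample size:", result["n"])
    print("naive (b0, b1):", result["naive"])
    print("weighted (b0, b1):", result["weighted"])
```

Under these assumed inclusion probabilities, the sample distribution of y given x is the population normal distribution shifted upward by 0.5, so the unweighted intercept is expected to be near 1.5 rather than 1 while the slope is largely unaffected; the weighted fit targets the population intercept at the cost of extra variance from the unequal weights.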

    Obesity and the timing of cohabitation and marriage

    The prevalence of adult overweight and obesity has increased substantially in the United States over the past few decades. Beyond its health consequences, obesity also has social and psychological consequences. As a social marker, it influences individuals' positions or status in various social contexts and thereby contributes to social stratification. Using data from the National Longitudinal Study of Adolescent Health (Add Health), this paper analyzes the effect of obesity on the likelihood and timing of union formation (marriage and cohabitation) among young adults. The research questions are: Does obesity affect union formation such as cohabitation and marriage? If obesity affects union formation, what are the mechanisms through which it does so? We found that, before controlling for possible confounding variables, obese young adults have a lower likelihood of entering cohabitation and marriage. After controlling for those variables, the difference between obese and non-obese young adults becomes non-significant, but those who are overweight have a higher likelihood to marry and cohabit

    Which Metric on the Space of Collider Events?

    Which is the best metric for the space of collider events? Motivated by the success of the Energy Mover's Distance in characterizing collider events, we explore the larger space of unbalanced optimal transport distances, of which the Energy Mover's Distance is a particular case. Geometric and computational considerations favor an unbalanced optimal transport distance known as the Hellinger-Kantorovich distance, which possesses a Riemannian structure that lends itself to efficient linearization. We develop the particle linearized unbalanced Optimal Transport (pluOT) framework for collider events based on the linearized Hellinger-Kantorovich distance and demonstrate its efficacy in boosted jet tagging. This provides a flexible and computationally efficient optimal transport framework ideally suited for collider physics applications. Comment: 17 pages, 5 figures, 3 table

    Gene by Social Context Interactions for Number of Sexual Partners among White Male Youths: Genetics‐Informed Sociology

    In this study, we set out to investigate whether introducing molecular genetic measures into an analysis of sexual partner variety will yield novel sociological insights. The data source is the white male DNA sample in the National Longitudinal Study of Adolescent Health. Our empirical analysis has produced a robust protective effect of the 9R/9R genotype relative to the Any10R genotype in the dopamine transporter gene (DAT1). The gene-environment interaction analysis demonstrates that the protective effect of 9R/9R tends to be lost in schools in which higher proportions of students start having sex early or among those with relatively low levels of cognitive ability. Our genetics-informed sociological analysis suggests that the “one size” of a single social theory may not fit all. Explaining a human trait or behavior may require a theory that accommodates the complex interplay between social contextual and individual influences and genetic predispositions

    Peer Influence, Genetic Propensity, and Binge Drinking: A Natural Experiment and a Replication

    The authors draw data from the College Roommate Study (ROOM) and the National Longitudinal Study of Adolescent Health (Add Health) to investigate gene-environment interaction effects on youth binge drinking. In ROOM, the environmental influence was measured by the precollege drinking behavior of randomly assigned roommates. Random assignment safeguards against friend selection and removes the threat of gene-environment correlation that makes gene-environment interaction effects difficult to interpret. On average, being randomly assigned a drinking peer as opposed to a nondrinking peer increased college binge drinking by 0.5-1.0 episodes per month, or 20%-40% of the average amount of binge drinking. However, this peer influence was found only among youths with a medium level of genetic propensity for alcohol use; those with either a low or high genetic propensity were not influenced by peer drinking. A replication of the findings is provided in data drawn from Add Health. The study shows that gene-environment interaction analysis can uncover social-contextual effects likely to be missed by traditional sociological approaches

    Genetic Bio-Ancestry and Social Construction of Racial Classification in Social Surveys in the Contemporary United States

    Self-reported race is generally considered the basis for racial classification in social surveys, including the U.S. census. Drawing on recent advances in human molecular genetics and social science perspectives of socially constructed race, our study takes into account both genetic bio-ancestry and social context in understanding racial classification. This article accomplishes two objectives. First, our research establishes geographic genetic bio-ancestry as a component of racial classification. Second, it shows how social forces trump biology in racial classification and/or how social context interacts with bio-ancestry in shaping racial classification. The findings were replicated in two racially and ethnically diverse data sets: the College Roommate Study (N = 2,065) and the National Longitudinal Study of Adolescent Health (N = 2,281)