65 research outputs found

    Usefulness and estimation of proportionality constraints

    Stata has long had the capability to impose the constraint that parameters are a linear function of one another. It does not have the capability to impose the constraint that if a set of parameters changes (due to interaction terms), the parameters maintain their relative differences. Such a proportionality constraint has a nice interpretation: the constrained variables together measure some latent concept. For instance, if a proportionality constraint is imposed on the variables father’s education, mother’s education, father’s occupational status, and mother’s occupational status, then together they might be thought to measure the latent variable family socioeconomic status. With the proportionality constraint one can estimate the effect of the latent variable and how strongly each observed variable loads on the latent variable (i.e., whether the mother, the father, or the highest-status parent matters most). Such a model is a special case of a so-called MIMIC model. In principle these models can be estimated using standard ml algorithms; however, because the parameters are rather strongly correlated, ml has a hard time finding the maximum. An EM algorithm is proposed that will find the maximum. This maximum is then fed into ml to get the right standard errors.

    Analyzing proportions

    In this talk, I will discuss some techniques available in Stata for analyzing dependent variables that are proportions. I will discuss four programs: betafit, glm, dirifit, and fmlogit. The first two deal with situations where we want to explain only one proportion, while the latter two deal with situations where we have for each observation multiple proportions that must add up to one. I will focus on how to interpret the results of these models and on the relative strengths and weaknesses of these models.

    Using and interpreting restricted cubic splines

    Sometimes one wants to model the effect of a variable as a nonlinear smooth curve. A convenient choice for such a curve is a restricted cubic spline. This option has existed in Stata for a while through user-written programs, but as of Stata 10, the mkspline command in combination with the cubic option has been implemented in official Stata. In this talk, I will briefly introduce splines and restricted cubic splines, but I will mainly focus on what happens after one has estimated a model with a restricted cubic spline, and in particular how the postrcspline package can help in the interpretation of the results.
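To make the idea of a restricted cubic spline concrete, the sketch below builds the spline terms for a single value of x in plain Python. It is an illustration only, not the code behind mkspline or postrcspline: the function name `rcs_basis`, the knot locations, and the unscaled Harrell-style parameterization are assumptions made for this example, and official implementations may rescale the terms.

```python
def rcs_basis(x, knots):
    """Return [x, s_1(x), ..., s_{k-2}(x)] for a restricted cubic spline.

    Hypothetical helper for illustration; uses an unscaled Harrell-style
    parameterization with k knots, giving k - 1 terms in total.
    """
    k = len(knots)
    if k < 3:
        raise ValueError("need at least three knots")
    t = sorted(knots)
    span = t[-1] - t[-2]  # distance between the last two knots

    def pos3(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    terms = [float(x)]  # the first term is x itself
    for j in range(k - 2):
        s = (pos3(x - t[j])
             - pos3(x - t[-2]) * (t[-1] - t[j]) / span
             + pos3(x - t[-1]) * (t[-2] - t[j]) / span)
        terms.append(s)
    return terms


# Illustrative knots (an assumption, not taken from the talk).
knots = [1.0, 3.0, 5.0, 8.0]

# Below the first knot every nonlinear term is zero, so the curve is linear.
below = rcs_basis(0.5, knots)

# Beyond the last knot the terms are linear in x: second differences vanish.
tail = [rcs_basis(v, knots) for v in (10.0, 11.0, 12.0)]
second_diff = [tail[2][i] - 2 * tail[1][i] + tail[0][i]
               for i in range(len(knots) - 1)]
```

The "restricted" part is visible in the two checks: the curve is forced to be linear before the first knot and after the last knot, which is why k knots yield only k - 1 regression terms.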

    Assessing the reasonableness of an imputation model

    Multiple imputation is a popular way of dealing with missing values under the missing at random (MAR) assumption. Imputation models can become quite complicated, for instance, when the model of substantive interest contains many interactions or when the data originate from a nested design. This paper will discuss two methods to assess how plausible the results are. The first method consists of comparing the point estimates obtained by multiple imputation with point estimates obtained by another method for controlling for bias due to missing data. In the second method, the changes in standard error between the model that ignores the missing cases and the multiple imputation model are decomposed into three components: changes due to changes in sample size, changes due to uncertainty in the imputation model used in multiple imputation, and changes due to changes in the estimates that underlie the standard error. This decomposition helps in assessing the reasonableness of the change in standard error. These two methods will be illustrated with two new user-written Stata commands.

    Modeling for response variables that are proportions

    When dealing with response variables that are proportions, people often use regress. This approach can be problematic because the model can lead to predicted proportions less than zero or more than one and to errors that are likely to be heteroskedastic and nonnormally distributed. This talk will discuss three more appropriate methods for proportions as response variables: betafit, dirifit, and glm. betafit is a maximum likelihood estimator using a beta likelihood, dirifit is a maximum likelihood estimator using a Dirichlet likelihood, and glm can be used to create a quasi–maximum likelihood estimator using a binomial likelihood. On an applied level, a difference between dirifit and the others is that the others can handle only one response variable, whereas dirifit can handle multiple response variables. For instance, betafit and glm can model the proportion of city budget spent on the category security (police and fire department), whereas dirifit can simultaneously model the proportions spent on categories security, social policy, infrastructure, and other. A further difference, this time between betafit and glm, is that glm can handle proportions of exactly zero and one, whereas betafit can handle only proportions strictly between zero and one. Special attention will be given to how to fit these models in Stata and how to interpret the results. This presentation will end with a warning not to use any of these techniques for ecological inference, i.e., using aggregated data to infer about individual units. To use a classic example: In the United States in the 1930s, states with a high proportion of immigrants also had a high literacy rate (in the English language), whereas immigrants were on average less literate than nonimmigrants. Regressing state-level literacy rate on state-level proportion of immigrants would thus give a completely wrong picture about the relationship between individual immigrant status and literacy.
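The contrast between a linear regression and a quasi-maximum likelihood estimator with a binomial likelihood can be sketched in a few lines of plain Python. This is a toy illustration, not the talk's material: the data are made up, the logit link is an assumption (the abstract names only the binomial likelihood), and `fit_fractional_logit` is a hypothetical helper that uses Newton's method for one covariate. Standard errors are not computed here; in practice a quasi-likelihood fit is usually paired with robust standard errors.

```python
import math


def fit_fractional_logit(x, y, iterations=25):
    """Maximize the binomial quasi-likelihood for y in [0, 1] (hypothetical helper).

    Model: E[y | x] = 1 / (1 + exp(-(a + b * x))). Returns (a, b).
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score (gradient of the quasi-log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian, X'WX with W = p * (1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Made-up proportions, including exact 0 and 1 (an assumption for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0]

# Ordinary least squares for comparison (closed form, one covariate).
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
ols_fitted = [intercept + slope * xi for xi in x]

# Fractional logit fit and its fitted proportions.
a_hat, b_hat = fit_fractional_logit(x, y)
logit_fitted = [1.0 / (1.0 + math.exp(-(a_hat + b_hat * xi))) for xi in x]

# Score at the solution; both components should be numerically zero.
score = (sum(yi - pi for yi, pi in zip(y, logit_fitted)),
         sum((yi - pi) * xi for xi, yi, pi in zip(x, y, logit_fitted)))
```

On these data the least-squares line already predicts a proportion below zero at the smallest x and above one at the largest x, while the logit-link fit stays inside the unit interval by construction and has no trouble with the observations that are exactly zero or one.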

    Class, status, and education: the influence of parental resources on IEO in Europe, 1893-1987

    Background of INCASI Project H2020-MSCA-RISE-2015 GA 691004. WP1: Compilation.
    There is a long tradition of studying the influence of parental background on the educational attainment of the offspring. Recently the emphasis in this tradition has shifted to the question of what parental background is. In particular, what contributes to social background: parental occupational status, parental occupational class, and/or parental education? Moreover, who contributes to parental background: the mother, the father, or both? In this article we ask whether these different components of parental background are stable across time and across countries, or whether some components are more important in some countries or periods than in others. We were able to reject the hypothesis that the contributions of the different components were constant across 29 European countries. In most of these countries we were also able to reject the hypothesis that these contributions were constant over time.

    Me, My Girls, and the Ideal Hotel: Segmenting Motivations of the Girlfriend Getaway Market Using Fuzzy C-Medoids for Fuzzy Data.

    Segmenting the motivation of travelers using the push and pull framework remains ubiquitous in tourism. This study segments the girlfriend getaway (GGA) market on motivation (push) and accommodation (pull) attributes and identifies relationships between these factors. Using a relatively novel clustering algorithm, the Fuzzy C-Medoids clustering for fuzzy data (FCM-FD), on a sample of 749 women travelers, three segments (Socializers, Enjoyers, and Rejoicers) are uncovered. The results of a multinomial fractional model show relationships between the clusters of motivation and accommodation attributes as well as sociodemographic characteristics. The research highlights the importance of using a gendered perspective in applying well-established motivation models such as the push and pull framework. The findings have implications for both destination and accommodation management.