107 research outputs found
The stability of simulation based estimation of the multiperiod multinominal probit model with individual specific covariates
The multi-period multinomial Probit model (MMPM) is seen as a flexible tool to explain individual choices among several alternatives over time. There are two versions of this model: a) for each individual the covariates for all alternatives are known, and b) for each individual only the parameters of the chosen alternative are known. The main difficulty with the MMPM was the calculation of the probability of the individual sequence of chosen alternatives, which requires computing an integral over a high-dimensional multivariate Normal density. This difficulty was removed by the Smooth Recursive Conditional (SRC) simulator. Several simulation studies have investigated the stability of the MMPM estimates, with special emphasis on the number of replications of the SRC routine. In contrast to these studies, which use the case of alternative-specific covariates, we use the case of individual-specific covariates. We conclude that the MMPM with individual-specific covariates is only weakly identified, generalizing Keane's (1992) result for the one-period case. As a consequence, the maximization of the simulated likelihood often converges to a singular covariance structure, so that the SRC routine stops iterating. This failure cannot be avoided by increasing the number of replications in the SRC routine. The percentage of these failures increases rapidly with the number of alternatives.
Keywords: discrete choice models, multi-period multinomial probit models, simulated maximum likelihood method, smooth recursive conditional simulator, panel data
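The simulation step mentioned in this abstract can be illustrated with a minimal Python sketch of a recursive conditioning simulator, a technique also known in the literature as the GHK simulator. This is not the authors' implementation: the function names `cholesky` and `src_probability`, the example covariance matrices and the bounds are assumptions made for illustration. The sketch estimates P(e < b) for e ~ N(0, Sigma) by drawing each component from a truncated standard normal, conditional on the components already drawn.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf and inv_cdf


def cholesky(sigma):
    """Lower-triangular factor L with sigma = L L' (plain-Python sketch)."""
    n = len(sigma)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return low


def src_probability(upper, sigma, replications=500, seed=0):
    """Simulate P(e_1 < upper_1, ..., e_n < upper_n) for e ~ N(0, sigma)."""
    rng = random.Random(seed)
    low = cholesky(sigma)
    total = 0.0
    for _ in range(replications):
        eta = []       # truncated standard normal draws
        weight = 1.0   # product of conditional probabilities
        for j, b in enumerate(upper):
            shift = sum(low[j][k] * eta[k] for k in range(j))
            bound = (b - shift) / low[j][j]   # fails if sigma is singular
            p_bound = STD_NORMAL.cdf(bound)
            weight *= p_bound
            # inverse-CDF draw from N(0, 1) truncated to (-inf, bound)
            p = min(max(rng.random() * p_bound, 1e-12), 1.0 - 1e-12)
            eta.append(STD_NORMAL.inv_cdf(p))
        total += weight
    return total / replications


# Hypothetical example: two correlated components, both bounded above by 0.
estimate = src_probability([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], replications=2000)
```

In this sketch a singular covariance matrix produces a zero on the diagonal of the Cholesky factor, so the division in `src_probability` raises an error. This is one plausible way in which a routine of this kind can stop iterating, regardless of the number of replications.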
Teaching and Statistical Training
The availability of well-educated researchers is necessary for the fruitful analysis of social and economic data. The increased supply of data made possible by the creation of the Research Data Centers (RDCs) has resulted in an increased demand for PhD students at the master's or Diplom levels. Especially in economics, where we find intense competition among the various individual subjects within the course of study, survey statistics has not been very successful in laying claim to a substantial proportion of the coursework and training. The situation is more favorable in sociology faculties. This article argues that the creation of new CAMPUS Files would help foster statistical education by providing public use files covering a wider range of subjects. It also presents some suggestions for new CAMPUS Files along these lines. Additionally, it argues for the establishment of master's programs in survey statistics to increase the availability of well-trained statisticians. An outline of such a master's program is presented, and current PhD programs are evaluated with respect to training in survey statistics. Training courses are also offered outside the university; these promote the use of new data sets and expand knowledge of new statistical methods or methods that lie outside standard education. These training courses are organized by the RDCs (i.e., the data producers), the Data Service Centers, or GESIS (Leibniz Institute for the Social Sciences). The current tendency to strengthen ties and collaborate with universities should be supported by making it possible to earn academic credit for such courses.
Keywords: master's programs, survey statistics, campus files, statistical training
Assessing the bias due to non-coverage of residential movers in the German microcensus panel: an evaluation using data from the socio-economic panel
The German Microcensus (MC) is a large-scale rotating panel survey over three years. The MC is attractive for longitudinal analysis over the entire participation duration because of the mandatory participation and the very high case numbers (about 200 thousand respondents). However, as a consequence of the area sampling that is used for the MC, residential mobility is not covered, and consequently statistical information at the new residence is lacking in the MC sample. This raises the question of whether longitudinal analyses, such as transitions between labour market states, are biased, and how well different methods that promise to reduce such a bias perform. Based on data from the German Socio-Economic Panel (SOEP), which covers residential mobility, we analysed the effects of the missing data on residential movers by estimating labour force flows. By comparing the results from the complete SOEP sample with the results from the SOEP restricted to the non-movers, we concluded that the non-coverage of residential movers cannot be ignored in Rubin's sense. With respect to correction methods, we analysed weighting by inverse mobility scores and loglinear models for partially observed contingency tables. Our results indicate that weighting by inverse mobility scores reduces the bias to about 60 percent, whereas the official longitudinal weights obtained by calibration result in a bias reduction of about 80 percent. The estimation of loglinear models for nonignorable nonresponse leads to very unstable results.
Keywords: panel survey, labour market analysis, residential mobility, non-coverage bias, log-linear modelling, inverse probability weighting
Is there a fade-away effect of initial nonresponse bias in EU-SILC?
Nonresponse in surveys may distort the distribution of interest. In a panel
survey the participation behavior in later waves differs from the
participation behavior at the start. With register data that also cover the
nonrespondents, one can observe a fade-away of the differences between the
distribution of the full sample, including nonrespondents, and that of the
respondent sample, which excludes them. The mechanics of this effect may be
explained by a Markov chain model. Under suitable regularity conditions the
distribution on the state space converges to the steady-state distribution of
the chain, which is independent of the starting distribution of the chain.
The fade-away effect is therefore regarded here as the swing-in into the
steady-state distribution. An essential condition for the fade-away effect is
that the same transition law holds for respondents and nonrespondents. This
hypothesis is investigated here for the Finnish subsample of EU-SILC, using
the equivalized household net income. The income is grouped into income
brackets that divide the starting sample into quintiles. This analysis is
based on register information. The null hypothesis of equal transition
behavior between income quintiles for respondents and nonrespondents cannot
be rejected. This finding confirms an earlier result for Finland from the
ECHP (European Community Household Panel). A second condition concerns the
selectivity of panel attrition after wave one: panel attrition must not depend
on the income state in the previous panel wave. The speed of the swing-in
into the steady-state distribution depends on the stability of staying in the
same income state. This stability may vary among the European countries. We
therefore investigated the transition matrices for 25 EU-SILC countries. We
simulated 6 different patterns of nonresponse bias and investigated the
fade-away effect across the waves 2006 to 2009. We found remarkable
differences between these 25 countries. Expressed in terms of the relative
bias, i.e. the bias in 2009 divided by the bias at the start in 2006, we found
reductions ranging from 26 percent of the initial bias for Bulgaria (strongest
reduction) to 61 percent for Finland (weakest reduction). Our results argue
for longer observation periods in rotation panels like EU-SILC.
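The Markov chain argument can be made concrete with a small Python sketch. Everything numerical here is an assumption for illustration: the five-state transition matrix built by the hypothetical helper `make_matrix`, the biased respondent distribution, and the use of the total variation distance as the bias measure are not taken from the paper. The full sample and the respondent sample are propagated with the same transition law, and the remaining bias is expressed relative to the bias at the start.

```python
def make_matrix(stay, n=5):
    """Hypothetical transition matrix: probability `stay` of remaining in
    the same income quintile, the rest spread evenly over the others."""
    move = (1.0 - stay) / (n - 1)
    return [[stay if i == j else move for j in range(n)] for i in range(n)]


def step(dist, matrix):
    """One panel wave: row vector times transition matrix."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def total_variation(p, q):
    """Distance between two distributions, used here as the bias measure."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def relative_bias_path(full, resp, matrix, waves):
    """Bias after each wave divided by the bias at the start."""
    start = total_variation(full, resp)
    path = [1.0]
    for _ in range(waves):
        full = step(full, matrix)
        resp = step(resp, matrix)
        path.append(total_variation(full, resp) / start)
    return path


full_sample = [0.2, 0.2, 0.2, 0.2, 0.2]        # quintiles of the starting sample
respondents = [0.10, 0.15, 0.20, 0.25, 0.30]   # assumed initial nonresponse bias

# Three transitions, e.g. 2006 -> 2009, for a stable and a mobile income process.
stable = relative_bias_path(full_sample, respondents, make_matrix(0.8), waves=3)
mobile = relative_bias_path(full_sample, respondents, make_matrix(0.6), waves=3)
```

With these assumed matrices, the process with the higher probability of staying in the same quintile retains a larger share of the initial bias after three waves. This mirrors the abstract's statement that the speed of the swing-in depends on the stability of the income state.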
eine Detektivarbeit (a piece of detective work)
The paper addresses the empirical finding that the different treatment of
small and large municipalities in the 2011 Census leads to differences of
different size between the population update (Bevölkerungsfortschreibung) and
the census. Large municipalities with more than 10,000 inhabitants show a
census count that is on average 1.5 percentage points lower than in the
population update. This finding, shown by Christensen et al. (2015), is
re-analysed separately for each federal state with an extended set of
nonparametric tools. In three federal states no such method effect of the
sampling procedure used from 10,000 inhabitants upwards is found. The paper
investigates why a seemingly general method effect does not appear in these
three federal states. Plausible arguments indicate that the population
registers in these three states are better maintained than in the other
states. From this perspective, the sampling procedure uncovers systematic
deficiencies in the population registers, whereas the formally barely
specified clarification procedure applied to the small municipalities does not
uncover them. In this sense, the large municipalities cannot complain of being
disadvantaged in the estimation of their official population count, since the
effect shown is attributable to imprecise population registration. However,
the small municipalities on average come off too well under the "clarification
procedure".
Die Zukunft der Statistik: Eine persönliche Betrachtung (The future of statistics: a personal view)
Based on a personal retrospective on developments in computing, data access and statistical software, trends for the future development of statistics in economics and the social sciences are derived. In particular, the role of R, new possibilities of data access, the relationship to official statistics and the introduction of new degree programmes in statistics are addressed. The account refers to economics and the social sciences. In other scientific fields, where statistics goes by names such as biometrics or psychometrics, the development trends described here may be irrelevant.
Keywords: data access, statistical packages, R, official statistics, statistics degree programmes
Schätzungen von Lohnkurven für Westdeutschland mit einem verallgemeinerten Varianz-Komponenten-Modell (Estimates of wage curves for western Germany using a generalised variance-component model)
"In the paper, for the first time in western Germany, estimates of the wage curve are presented using individual panel data. Therefore, it is possible to control for the influence of individual unobserved heterogeneity on the estimation of the wage curve. The relationship between regional wages and the regional unemployment rate diminishes clearly if this is done. Furthermore, panel data makes it possible to control for unobserved regional effects through a specific variance component in the equation to be estimated. The results point to regional effects of a quite remarkable size." (Author's abstract, IAB-Doku)
Keywords: wage curve, wage level, unemployment, regional labour market, western Germany, Federal Republic of Germany
Kernel Density Estimation for Heaped Data
In self-reported data usually a phenomenon called `heaping' occurs, i.e.
survey participants round the values of their income, weight or height to some
degree. Additionally, respondents may be more prone to round down or up due to
social desirability. By ignoring the heaping process a severe bias in terms of
spikes and bumps is introduced when applying kernel density methods naively to
the rounded data. A generalized Stochastic Expectation Maximization (SEM)
approach accounting for heaping with potentially asymmetric rounding behaviour
in univariate kernel density estimation is presented in this work. The
introduced methods are applied to survey data of the German Socio-Economic
Panel and exhibit very good performance in simulations.
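The bias from ignoring heaping can be illustrated with a short Python sketch. It is not the SEM approach of the paper: it only simulates symmetric rounding and applies a naive Gaussian kernel density estimate to the reported values. The income distribution, the rounding grid of 500, the rounding probability of 0.6 and the bandwidth of 30 are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_kde(data, x, bandwidth):
    """Naive Gaussian kernel density estimate at the point x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data)


def simulate_heaping(values, grid, p_round, rng):
    """Each respondent rounds to the nearest multiple of `grid` with
    probability `p_round` (symmetric rounding only, an assumption)."""
    return [round(v / grid) * grid if rng.random() < p_round else v
            for v in values]


rng = random.Random(42)
true_values = [rng.gauss(2000.0, 500.0) for _ in range(2000)]  # assumed incomes
reported = simulate_heaping(true_values, grid=500.0, p_round=0.6, rng=rng)

# Density estimates at a heaping point (2000) and between two heaps (2250).
heap_true = gaussian_kde(true_values, 2000.0, bandwidth=30.0)
heap_reported = gaussian_kde(reported, 2000.0, bandwidth=30.0)
gap_true = gaussian_kde(true_values, 2250.0, bandwidth=30.0)
gap_reported = gaussian_kde(reported, 2250.0, bandwidth=30.0)
```

Under these assumptions the naive estimate on the reported data shows a spike at the heaping point and a trough between heaps compared with the estimate on the unrounded values. This is the kind of distortion a heaping-aware estimator is meant to correct.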