396 research outputs found
Teaching and Statistical Training
The availability of well-educated researchers is necessary for the fruitful analysis of social and economic data. The greater supply of data made possible by the creation of the Research Data Centers (RDCs) has resulted in an increased demand for PhD students at the master’s or Diplom levels. Especially in economics, where the individual subjects within the course of study compete intensely, survey statistics has not been very successful in laying claim to a substantial proportion of the coursework and training. The situation is more favorable in sociology faculties. This article argues that the creation of new CAMPUS Files would help foster statistical education by providing public use files covering a wider range of subjects, and it presents some suggestions for new CAMPUS Files along these lines. Additionally, it argues for the establishment of master’s programs in survey statistics to increase the availability of well-trained statisticians. An outline of such a master’s program is presented, and current PhD programs are evaluated with respect to training in survey statistics. Training courses are also offered outside the university; they promote the use of new data sets and expand knowledge of new statistical methods or of methods that lie outside standard education. These training courses are organized by the RDCs (i.e., the data producers), the Data Service Centers, or GESIS (Leibniz Institute for the Social Sciences). The current tendency to strengthen ties and collaborate with universities should be supported by making it possible to earn academic credit for such courses. --master’s programs, survey statistics, campus files, statistical training
The stability of simulation based estimation of the multiperiod multinomial probit model with individual specific covariates
The multi-period multinomial Probit model (MMPM) is seen as a flexible tool to explain individual choices among several alternatives over time. There are two versions of this model: a) for each individual the covariates for all alternatives are known, and b) for each individual only the parameters of the chosen alternative are known. The main difficulty with the MMPM was the calculation of the probability of the individual sequence of chosen alternatives, which requires the computation of an integral over a high-dimensional multivariate Normal density. This difficulty was removed by the Smooth Recursive Conditional (SRC) simulator. Several simulation studies have investigated the stability of the MMPM estimates, with special emphasis on the number of replications of the SRC routine. In contrast to these studies, which use the case of alternative specific covariates, we use the case of individual specific covariates. We conclude that the MMPM with individual specific covariates is only weakly identified, generalizing Keane's (1992) result for the one-period case. As a consequence, the maximization of the simulated likelihood often converges to a singular covariance structure, so that the SRC routine stops iterating. This feature cannot be avoided by increasing the number of replications in the SRC routine. The percentage of these failures increases rapidly with the number of alternatives. --discrete choice models, multi-period multinomial probit models, simulated maximum likelihood method, smooth recursive conditional simulator, panel data
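The recursive conditioning idea behind an SRC-type simulator can be sketched as follows. This is a minimal illustration, not the authors' routine: it only estimates a rectangle probability P(X_1 < b_1, ..., X_n < b_n) for X ~ N(0, Sigma), and the names `cholesky`, `src_probability`, `upper`, `sigma`, and `replications` are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def cholesky(sigma):
    """Lower-triangular L with L L' = sigma; fails if sigma is not positive definite."""
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                lower[i][j] = (sigma[i][j] - s) / lower[j][j]
    return lower


def src_probability(upper, sigma, replications, seed=0):
    """Simulate P(X_i < upper_i for all i) for X ~ N(0, sigma)."""
    rng = random.Random(seed)
    lower = cholesky(sigma)
    n = len(upper)
    total = 0.0
    for _ in range(replications):
        draws = []
        weight = 1.0
        for i in range(n):
            # Bound for the i-th standard normal, given the earlier draws.
            shift = sum(lower[i][k] * draws[k] for k in range(i))
            bound = (upper[i] - shift) / lower[i][i]
            p = STD_NORMAL.cdf(bound)
            weight *= p  # conditional probability of staying below the bound
            # Draw from the standard normal truncated to (-inf, bound) by inversion.
            u = rng.random()
            draws.append(STD_NORMAL.inv_cdf(max(u * p, 1e-300)))
        total += weight
    return total / replications
```

For example, `src_probability([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], 5000)` should be close to the known orthant probability 1/4 + arcsin(0.5)/(2π) = 1/3. In this sketch a singular covariance matrix makes the Cholesky step fail (a division by zero or a square root of a negative number), which is one plausible way a routine of this kind could stop when the likelihood maximization drifts toward a singular covariance structure.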
Design-oriented weighting of a household panel
The method of inverse sampling probabilities is adopted to calculate weights for a household panel. The method generates longitudinal as well as cross-sectional weights, which reflect the successive sampling stages of the panel and the different ways in which households can enter the panel.
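The basic principle of weighting by inverse sampling probabilities can be sketched as below. This is a generic illustration of the principle, not the panel's actual weighting scheme; the function names and the two-stage probabilities in the usage note are hypothetical.

```python
def design_weight(stage_probabilities):
    """Inverse of the overall inclusion probability across sampling stages."""
    overall = 1.0
    for p in stage_probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError("each stage probability must lie in (0, 1]")
        overall *= p  # stages are assumed to be selected independently
    return 1.0 / overall


def weighted_total(values, weights):
    """Horvitz-Thompson style estimate of a population total."""
    return sum(v * w for v, w in zip(values, weights))
```

A household drawn with probability 0.1 at the first stage and 0.5 at the second has `design_weight([0.1, 0.5])` of about 20, meaning it stands for roughly 20 households in the population. A longitudinal weight could extend the same product by one further factor per wave, under the assumption that the probability of remaining in the panel is known or estimated.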
Is there a fade-away effect of initial nonresponse bias in EU-SILC?
Nonresponse in surveys may distort the distribution of interest. In a panel survey, participation behavior in later waves differs from participation behavior at the start. With register data that also cover the non-respondents, one can observe a fade-away of the differences between the distribution of the full sample, including the nonrespondents, and the distribution of the respondent sample. The mechanics of this effect may be explained by a Markov chain model. Under suitable regularity conditions, the distribution on the state space converges to the steady state distribution of the chain, which is independent of the starting distribution. The fade-away effect is therefore considered here as the swing-in into the steady state distribution. An essential condition for the fade-away effect is that respondents and nonrespondents follow the same transition law. This hypothesis is investigated here for the Finnish subsample of EU-SILC, using the equivalized household net income. The income is grouped into brackets that divide the starting sample into quintiles. The analysis is based on register information. The null hypothesis of equal transition behavior between income quintiles for respondents and nonrespondents cannot be rejected. This finding confirms an earlier result for Finland from the ECHP (European Community Household Panel). A second condition concerns the selectivity of panel attrition after wave one: panel attrition must not depend on the income state of the previous panel wave. The speed of the swing-in into the steady state distribution depends on how stable the income states are, that is, on the probability of staying in the same income state. This stability may vary among the European countries. We therefore investigated the transition matrices for 25 EU-SILC countries. We simulated six different patterns of nonresponse bias and investigated the fade-away effect across the waves 2006 to 2009. We found remarkable differences between these 25 countries. Expressed as the relative bias, i.e. the bias in 2009 divided by the initial bias in 2006, the remaining bias ranges from 26 percent of the initial bias for Bulgaria (strongest reduction) to 61 percent for Finland (weakest reduction). Our results argue for longer observation periods in rotation panels like EU-SILC.
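The Markov chain argument can be illustrated with a small sketch. It assumes a stylized transition matrix in which a unit stays in its income quintile with probability `stay` and moves to each other quintile with equal probability; this matrix, the biased respondent distribution in the usage note, and the function names are hypothetical and not taken from the EU-SILC data.

```python
def transition_matrix(stay, n_states=5):
    """Stylized matrix: stay with probability `stay`, else move uniformly."""
    move = (1.0 - stay) / (n_states - 1)
    return [[stay if i == j else move for j in range(n_states)]
            for i in range(n_states)]


def step(dist, matrix):
    """Advance a distribution over the states by one wave: dist * matrix."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def relative_bias(full, respondents, matrix, waves):
    """Remaining bias after `waves` transitions, divided by the initial bias.

    Bias is measured as the total absolute difference between the two
    distributions; both groups follow the same transition law.
    """
    def bias(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    start = bias(full, respondents)
    for _ in range(waves):
        full = step(full, matrix)
        respondents = step(respondents, matrix)
    return bias(full, respondents) / start
```

With a full sample spread evenly over the quintiles, `[0.2] * 5`, and a respondent sample that under-represents low incomes, say `[0.10, 0.15, 0.20, 0.25, 0.30]`, three transitions (2006 to 2009) leave a relative bias of 0.625³ ≈ 0.24 for `stay = 0.7` and 0.875³ ≈ 0.67 for `stay = 0.9`. In this stylized setting the bias shrinks by the factor (5·stay − 1)/4 per wave, so more stable income states mean a slower fade-away.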
Assessing the bias due to non-coverage of residential movers in the German microcensus panel: an evaluation using data from the socio-economic panel
The German Microcensus (MC) is a large-scale rotating panel survey in which households participate over three years. The MC is attractive for longitudinal analysis over the entire participation period because participation is mandatory and the number of cases is very high (about 200 thousand respondents). However, as a consequence of the area sampling used for the MC, residential mobility is not covered, and consequently statistical information at the new residence is lacking in the MC sample. This raises the question of whether longitudinal analyses, such as transitions between labour market states, are biased, and how well different methods that promise to reduce such a bias perform. Based on data from the German Socio-Economic Panel (SOEP), which covers residential mobility, we analysed the effects of the missing data on residential movers for the estimation of labour force flows. By comparing the results from the complete SOEP sample with the results from the SOEP restricted to the non-movers, we concluded that the non-coverage of residential movers cannot be ignored in Rubin's sense. With respect to correction methods, we analysed weighting by inverse mobility scores and loglinear models for partially observed contingency tables. Our results indicate that weighting by inverse mobility scores reduces the bias to about 60 percent, whereas the official longitudinal weights obtained by calibration result in a bias reduction of about 80 percent. The estimation of loglinear models for nonignorable nonresponse leads to very unstable results. --panel survey, labour market analysis, residential mobility, non-coverage bias, log-linear modelling, inverse probability weighting
Documentation of sample sizes and panel attrition in the German Socio Economic Panel (GSOEP) (1984 until 1995) [Subsamples A, B, C]
An overview is given of the survey design of the first wave of the SOEP subsamples A (West Germans), B (foreigners), and C (East Germans). The follow-up rules of the SOEP are also described. The development of the sample sizes is presented separately for each subsample, both cross-sectionally and longitudinally. From wave 7 (1990) onward, the migration of sample members between East and West Germany is documented, distinguishing between private households and the institutional sector. Survey-related attrition is described at two levels: losses due to loss of contact, and losses among re-contacted households due to refused interviews. At both levels, attrition rates by individual household characteristics are documented, as are the estimation results when several characteristics are considered jointly. The estimates are based on multiple logistic regression models. They form the empirical basis for the longitudinal weighting of the SOEP. The documentation closes with pointers to further literature.
Documentation of the Socio-Economic Panel (SOEP): Survey design, sample sizes, survey-related attrition, and the estimation of attrition probabilities up to wave 12 (1984 to 1995) [Samples A, B, and C]
An overview is given of the survey design of the first wave of the SOEP subsamples A (West Germans), B (foreigners), and C (East Germans). The follow-up rules of the SOEP are also described. The development of the sample sizes is presented separately for each subsample, both cross-sectionally and longitudinally. From wave 7 (1990) onward, the migration of sample members between East and West Germany is documented, distinguishing between private households and the institutional sector. Survey-related attrition is described at two levels: losses due to loss of contact, and losses among re-contacted households due to refused interviews. At both levels, attrition rates by individual household characteristics are documented, as are the estimation results when several characteristics are considered jointly. The estimates are based on multiple logistic regression models. They form the empirical basis for the longitudinal weighting of the SOEP. The documentation closes with pointers to further literature.
The Future of Statistics: A Personal View
Based on a personal retrospective on developments in computing, data access, and statistical software, trends are derived for the future development of statistics in economics and the social sciences. In particular, the role of R, new possibilities for data access, the relationship to official statistics, and the introduction of new degree programs in statistics are addressed. The account is confined to economics and the social sciences. In other fields, where statistics goes by the name of biometrics, psychometrics, etc., the tendencies described here may be irrelevant. --data access, statistical packages, R, official statistics, degree programs in statistics
