270 research outputs found

    General-purpose imputation of planned missing data in social surveys: Different strategies and their effect on correlations

    Planned missing survey data, for example stemming from split questionnaire designs, are becoming increasingly common in survey research, making imputation indispensable for obtaining reasonably analyzable data. However, these data can be difficult to impute due to low correlations, many predictors, and limited sample sizes to support imputation models. This paper presents findings from a Monte Carlo simulation, in which we investigate the accuracy of correlations after multiple imputation using different imputation methods and predictor set specifications based on data from the German Internet Panel (GIP). The results show that strategies that simplify the imputation exercise (such as predictive mean matching with dimensionality reduction or restricted predictor sets, linear regression models, or the multivariate normal model without transformation) perform well, while especially generalized linear models for categorical data, classification trees, and imputation models with many predictor variables lead to strong biases.
    Planned missing values in social science surveys, for example as a result of a split questionnaire design, are occurring ever more frequently in survey research. To obtain adequately analyzable data, imputation is often required. However, the statistical modeling involved in imputing such data can pose enormous challenges because of low correlations, a large number of possible predictors, and limited sample sizes. This article presents results from a Monte Carlo simulation that, based on data from the German Internet Panel (GIP), examines the validity of correlation estimates in a split questionnaire design under various imputation strategies. It shows that approaches that simplify the imputation can lead to good results (e.g., predictive mean matching with dimensionality reduction or few predictor variables). By contrast, generalized linear models for categorical data, classification trees (CART), and imputation models with many predictor variables in particular can result in strong biases.

    Imputation of missing data from split questionnaire designs in social surveys

    Amidst the challenges of declining response rates and escalating costs in survey research, the adoption of innovative data collection designs such as planned missingness and split questionnaire designs is becoming increasingly prevalent. This dissertation addresses the imputation of social survey data from split questionnaire designs and the methodological decisions associated with implementing such surveys to facilitate imputation. Through a series of Monte Carlo simulations, drawing on real social survey data from the German Internet Panel and the European Social Survey, this research assesses the accuracy of estimates across various scenarios, encompassing the implementation of both the split questionnaire design and the subsequent imputation. It examines the impacts of different split questionnaire module construction strategies, varying imputation techniques, the interplay between planned missingness and conventional item nonresponse, and the implications of general-purpose versus analysis-specific imputation for the accuracy of estimates for a multivariate model. The insights from these simulations offer guidance and recommendations for the implementation of split questionnaire designs in social surveys.

    A data-driven approach to monitoring data collection in an online panel

    Longitudinal or panel surveys suffer from panel attrition, which may result in biased estimates. Online panels are no exception to this phenomenon, but they offer great possibilities for monitoring and managing the data collection phase and response-enhancement features (e.g., reminders), thanks to the real-time availability of paradata. This paper presents a data-driven approach to monitoring the data collection phase and informing the adjustment of response-enhancement features during data collection across online panel waves, which takes into account the characteristics of an ongoing panel wave. For this purpose, we study the evolution of the daily response proportion in each wave of a probability-based online panel. Using multilevel models, we predict the data collection evolution per wave day. In our example, the functional form of the data collection evolution is quintic. The characteristics affecting the shape of the data collection evolution are characteristics of the specific wave day and not of the panel wave itself. In addition, we simulate the monitoring of the daily response proportion of one panel wave and find that the timing of sending reminders could be adjusted after 20 consecutive panel waves to keep the data collection phase efficient. Our results demonstrate the importance of re-evaluating the characteristics of the data collection phase, such as the timing of reminders, across the lifetime of an online panel to keep the fieldwork efficient.

    Analysis of RNA splicing defects in PITX2 mutants supports a gene dosage model of Axenfeld-Rieger syndrome

    BACKGROUND: Axenfeld-Rieger syndrome (ARS) is associated with mutations in the PITX2 gene, which encodes a homeobox transcription factor. Several intronic PITX2 mutations have been reported in Axenfeld-Rieger patients, but their effects on gene expression have not been tested. METHODS: We present two new families with recurrent PITX2 intronic mutations and use PITX2c minigenes and transfected cells to address the hypothesis that intronic mutations affect RNA splicing. Three PITX2 mutations have been analyzed: a G>T mutation within the AG 3' splice site (ss) junction associated with exon 4 (IVS4-1G>T), a G>C mutation at position +5 of the 5' ss of exon 4 (IVS4+5G>C), and a previously reported A>G substitution at position -11 of the 3' ss of exon 5 (IVS5-11A>G). RESULTS: Mutation IVS4+5G>C showed 71% retention of the intron between exons 4 and 5, and poorly expressed protein. Wild-type protein levels were proportionally expressed from correctly spliced mRNA. The G>T mutation within the exon 4 AG 3' ss junction shifted splicing exclusively to a new AG and resulted in a severely truncated, poorly expressed protein. Finally, the A>G substitution at position -11 of the 3' ss of exon 5 shifted splicing exclusively to a newly created upstream AG and resulted in generation of a protein with a truncated homeodomain. CONCLUSION: This is the first direct evidence to support aberrant RNA splicing as the mechanism underlying the disorder in some patients and suggests that the magnitude of the splicing defect may contribute to the variability of ARS phenotypes, in support of a gene dosage model of Axenfeld-Rieger syndrome.

    Manual of Eye Diseases: (Lehrbuch der Augenheilkunde. Hrsg. von prof. dr. Th. Axenfeld. Mit 10 Farbentafeln und 437 zum grossen Teil mehrfarbigen Abbild. in Text G. Fischer. Jena. 1909)

    Translated from German, with additions and a preface by A.A. Ivanov and G.S. Kantsel