25 research outputs found

    Stationary set preserving L-forcings and the extender algebra

    We review the construction of Jensen's L-forcing, which we apply to study the Pi_2 consequences of the theory ZFC + BMM + "the nonstationary ideal on omega_1 is precipitous". Many natural consequences of ZFC + MM already follow from this weaker theory. We give a new characterization of the axiom dagger ("All stationary set preserving forcings are semiproper") by isolating a class of stationary set preserving L-forcings whose semiproperness is equivalent to dagger. This characterization is used to generalize work of Todorcevic: we show that Rado's Conjecture implies dagger. Furthermore, we study genericity iterations beginning with a measurable Woodin cardinal. With this tool we obtain a generalization of Woodin's Sigma^2_1 absoluteness theorem.

    Testing for Publication Bias in Diagnostic Meta-Analysis: A Simulation Study

    The present study investigates the performance of several statistical tests to detect publication bias in diagnostic meta-analysis by means of simulation. While bivariate models should be used to pool data from primary studies in diagnostic meta-analysis, univariate measures of diagnostic accuracy are preferable for the purpose of detecting publication bias. In contrast to earlier research, which focused solely on the diagnostic odds ratio or its logarithm (ln ω), the tests are combined with four different univariate measures of diagnostic accuracy. For each combination of test and univariate measure, both type I error rate and statistical power are examined under diverse conditions. The results indicate that tests based on linear regression or rank correlation cannot be recommended in diagnostic meta-analysis, because type I error rates are either inflated or power is too low, irrespective of the applied univariate measure. In contrast, the combination of trim and fill and ln ω has non-inflated or only slightly inflated type I error rates and medium to high power, even under extreme circumstances (at least when the number of studies per meta-analysis is large enough). Therefore, we recommend the application of trim and fill combined with ln ω to detect funnel plot asymmetry in diagnostic meta-analysis. Please cite this paper as published in Statistics in Medicine (https://doi.org/10.1002/sim.6177). Comment: arXiv admin note: text overlap with arXiv:2002.04775 by other author.
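For orientation, the log diagnostic odds ratio ln ω named in this abstract can be computed from a single study's 2x2 table as ln((TP · TN) / (FP · FN)). The following is a minimal sketch of that standard definition only, not the simulation code of the study. The function name and the 0.5 continuity correction for tables containing a zero cell are assumptions made for illustration.

```python
import math


def log_diagnostic_odds_ratio(tp, fp, fn, tn, correction=0.5):
    """Return ln(omega) = ln((tp * tn) / (fp * fn)) for one 2x2 table.

    Assumption (not from the abstract): if any cell is zero, `correction`
    is added to all four cells so that the ratio stays finite.
    """
    cells = (tp, fp, fn, tn)
    if any(c == 0 for c in cells):
        tp, fp, fn, tn = (c + correction for c in cells)
    return math.log((tp * tn) / (fp * fn))


# Hypothetical study: 90 true positives, 10 false positives,
# 10 false negatives, 90 true negatives, so omega = 81.
example = log_diagnostic_odds_ratio(90, 10, 10, 90)
```

Here `example` equals ln(81), about 4.394. A test for funnel plot asymmetry such as trim and fill would then operate on one such value per primary study.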

    Modellierung eines glatten Lernverlaufs und Testung individueller Abweichungen von einem globalen Verlauf [Modeling a smooth course of learning and testing individual deviations from a global course]

    Formative assessment supplies valuable feedback for teachers and learners, and has been facilitated by computerized implementations. While longitudinal within-student assessment or within-class comparisons are useful, a normative interpretation of an individual's course of learning can only be given relative to a reference population. As current computerized assessment systems sample items from pools or adapt tests, monitored students might work on non-overlapping item sets, so that classic sum scores cannot be compared directly. To meet this challenge, the Smooth Growth and Linear Deviations Rasch Model (SGLDRM) is introduced, an extension of Rasch's item response theory model for binary test data. With the help of spline functions, a smooth global course of learning is included. The model is flexible enough to accommodate increases and/or decreases of the mean ability level, which might be more or less pronounced at each measurement occasion. On the individual level, a random slope and a random intercept with amenable interpretations modify the global course of learning. Two measurement occasions suffice to estimate person-specific courses. A likelihood ratio test allows identifying students whose performance differs from the mean course. The methodology is illustrated with data from an online dyscalculia assessment and training. (DIPF/Orig.)

    Optimal design of the Wilcoxon-Mann-Whitney-test

    In scientific research, many hypotheses relate to the comparison of two independent groups. Usually, it is of interest to use a design (i.e., the allocation of sample sizes m and n for fixed N = m + n) that maximizes the power of the applied statistical test. It is known that the two-sample t-tests for homogeneous and heterogeneous variances may lose substantial power when variances are unequal but equally large samples are used. We demonstrate that this is not the case for the non-parametric Wilcoxon-Mann-Whitney-test, whose application in biometrical research fields is motivated by two examples from cancer research. We prove the optimality of the design m = n in case of symmetric and identically shaped distributions using normal approximations and show that this design generally offers power only negligibly lower than the optimal design for a wide range of distributions. Please cite this paper as published in the Biometrical Journal (https://doi.org/10.1002/bimj.201600022).

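    Verlaufsdiagnostik arithmetischer Grundkompetenzen. Messen verschiedene Booklets die gleiche Fähigkeit? [Progress monitoring of basic arithmetic competencies. Do different booklets measure the same ability?]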

    An important prerequisite of progress monitoring as one source to support instructional decision-making is the existence of equivalent booklets. This study assesses this prerequisite with respect to a German elementary school math curriculum-based measurement instrument (LVD-M 2-4; Strathmann & Klauer, 2012). Every second week of a 19-week period, n = 108 third and n = 109 fourth graders (regular instruction) completed one of ten parallel booklets, each containing 24 arithmetic tasks. Analyses with (generalized) linear mixed models showed that in both grades the between-booklet variance was so small in relation to the between-student variance that it was practically irrelevant. This corresponds to the key assumption of the binomial model that equivalent scores from different booklets reflect the same ability. While item difficulty varied within some of the tasks, the effect was insubstantial in comparison with the variance between students. These findings were replicated in two intervention samples of an RTI study. The parallel booklets can therefore be regarded as equivalent for typical applied purposes. Implications of these findings for curriculum-based measurement and booklet design are discussed. (DIPF/Orig.)

    We are going on a gold coin hunt: Psychometric properties of different scorings in computer-based progress monitoring of mathematics ability

    Based on the robust indicator approach, a new progress monitoring instrument was developed and embedded into an online training for children with mathematical learning difficulties. The test captures the development in two basic mathematical competences: arithmetic fact knowledge (addition and subtraction up to 20) and numerical order processing (up to 100). According to the "high speed, high stakes" principle, speed and precision performance are combined into a single efficiency score on item level, expressed as the earnings or losses of gold coins. In a field study with primary school children (N = 241, grades 2 to 4), this new efficiency scoring showed both higher reliability (r = .87-.93) and higher criterion validity (r = .51) than simple speed or precision scorings. Moreover, individual results in the progress monitoring test were sensitive to the sample's performance gains related to the training: For all three scorings, parameters of individual progress trajectories (random intercepts and random slopes) were predictive of post-training performance. Taken together, the new progress monitoring test qualifies as a reliable and valid tool for the formative assessment of primary school children's learning progress in basic mathematical abilities. Especially for low achieving children, the gold coin scoring offers an attractive incentive that should prevent low performance due to low motivation and foster the utilization of retrieval-based solution strategies. Hence, the test and training system could be implemented into remedial classroom practice. (DIPF/Orig.)

    A randomized waitlist-controlled trial comparing detached mindfulness and cognitive restructuring in obsessive-compulsive disorder.

    Objective: Whereas research has demonstrated the efficacy of cognitive restructuring (CR) for obsessive-compulsive disorder (OCD), little is known about the efficacy of specific metacognitive interventions such as detached mindfulness (DM). Therefore, this study compared the efficacy of CR and DM as stand-alone interventions. Design: We conducted a randomized waitlist-controlled trial. A total of n = 43 participants were randomly assigned to either DM or CR. Out of those participants, n = 21 had been previously assigned to a two-week waitlist condition. Materials and methods: In both conditions, treatment comprised four double sessions within two weeks. Assessment took place at baseline (Pre1), after treatment (Post) and four weeks after the end of treatment (FU). There was a second baseline assessment (Pre2) in the waitlist group. Independent evaluators were blinded concerning the active condition. Adherence and competence ratings for the two therapists were obtained from an independent rater. Results: 40 patients completed the treatment. Two patients dropped out because of exacerbated depression. There were no further adverse events. Both CR and DM were shown to be superior to waitlist and equally effective at reducing OCD symptoms from pre to post assessment as measured with the Y-BOCS (CR: d = 1.67, DM: d = 1.55). In each of the two treatment conditions, eight patients (40%) exhibited a clinically significant change at post assessment. Conclusions: The results of this clinical trial suggest the potential efficacy of DM as a stand-alone intervention for OCD; however, our findings need to be interpreted with caution. Results indicate that both CR and DM should be considered as possible alternative treatments for OCD, whereas the working mechanisms of DM have yet to be elucidated.