Rasch Analysis of Relational Well-Being Within the National Survey of Adoptive Parents of 2007: A Comparison of Multidimensional, Consecutive, and Unidimensional Approaches to Measure Construction
A unidimensional Rasch approach was used to explore whether the data collected through the National Survey of Adoptive Parents of 2007 (NSAP) for the well-being items represented a single latent construct and to establish a base model for comparison. A consecutive approach was then used as an exploratory tool to draw out potential multiple dimensions. Finally, multidimensional item response theory (MIRT) was used to confirm the results of the consecutive approach findings while comparing with the unidimensional baseline. Items within the survey were evaluated for scale function as well as invariance.
The comparison of three approaches (unidimensional, combined consecutive, and 2-dimensional MIRT) found that the combination of Consecutive Dimensions A and B yielded the best fitting model for these data sets. The nested 2-dimensional MIRT model showed better fit than the unidimensional model, but concerns with item position and inconsistent error terms supported the combined consecutive model.
The use of IRT and MIRT analysis techniques helped strengthen the survey by identifying items within the survey that relate to identified constructs. The comparison of three approaches provides practitioners with an example of how to use a consecutive approach in Rasch for exploratory purposes when dimensionality has not already been established.
The NSAP survey was developed to gather data from a large cross-section of adoptive parents in the United States. The well-being subsection gathered data on the parent-child relationship with the intent to assist adoption practitioners, policy-makers, and researchers. Because only twelve of the thirty-nine items were used in the models, the data collection opportunity was not fully realized. This lost opportunity supports the idea of survey development partnerships between content experts and psychometricians when building measures, to maximize the effectiveness of the tool as well as the data gathered.
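As a point of reference for the Rasch approaches compared in this abstract, the following is a minimal sketch of the dichotomous Rasch model. It is an illustration only: the abstract does not state the response format of the NSAP items (they may well be polytomous), and the function names and the item difficulties below are hypothetical.

```python
import math

def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a positive response under the dichotomous Rasch model.

    theta and difficulty are both expressed in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def expected_score(theta: float, difficulties: list[float]) -> float:
    """Expected raw score on a unidimensional scale: the sum of item probabilities."""
    return sum(rasch_probability(theta, b) for b in difficulties)

# Hypothetical item difficulties (logits); not taken from the NSAP data.
ITEM_DIFFICULTIES = [-1.0, 0.0, 1.5]
```

In a consecutive approach, a model of this kind would be fitted separately to each candidate dimension, whereas a multidimensional model estimates the dimensions jointly.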
Design and Validation of a Novel Tool to Assess Citizens’ Netiquette and Information and Data Literacy Using Interactive Simulations
Until recently, most of the digital literacy frameworks have been based on assessment frameworks used by commercial entities. The release of the DigComp framework has allowed the development of tailored implementations for the evaluation of digital competence. However, the majority of these digital literacy frameworks are based on self-assessments, measuring only lower-order cognitive skills. This paper reports on a study to develop and validate an assessment instrument that includes interactive simulations to assess citizens’ digital competence. These formats are particularly important for the evaluation of complex cognitive constructs such as digital competence. Additionally, we selected two different approaches for designing the tests based on their scope, at the competence or competence area level. Their overall and dimensional validity and reliability were analysed. We summarise the issues addressed in each phase and key points to consider in new implementations. For both approaches, items present satisfactory difficulty and discrimination indicators. Validity was ensured through expert validation, and the Rasch analysis revealed good EAP/PV reliabilities. Therefore, the tests have sound psychometric properties that make them reliable and valid instruments for measuring digital competence. This paper contributes to an increasing number of tools designed to evaluate digital competence and highlights the necessity of measuring higher-order cognitive skills. This research received no external funding.
Multi-Criteria Evaluation in Support of the Decision-Making Process in Highway Construction Projects
The decision-making process in highway construction projects identifies and selects the optimal alternative based on the user requirements and evaluation criteria. The current practice of the decision-making process does not consider all construction impacts in an integrated decision-making process. This dissertation developed a multi-criteria evaluation framework to support the decision-making process in highway construction projects. In addition to the construction cost and mobility impacts, reliability, safety, and emission impacts are assessed at different evaluation levels and used as inputs to the decision-making process.
Two levels of analysis, referred to as the planning level and operation level, are proposed in this research to provide input to a Multi-Criteria Decision-Making (MCDM) process that considers user prioritization of the assessed criteria. The planning level analysis provides faster and less detailed assessments of the inputs to the MCDM utilizing analytical tools, mainly in a spreadsheet format. The second level of analysis produces more detailed inputs to the MCDM and utilizes a combination of a mesoscopic simulation-based dynamic traffic assignment tool and a microscopic simulation tool, combined with other utilities.
The outputs generated from the two levels of analysis are used as inputs to a decision-making process based on present worth analysis and the Fuzzy TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) MCDM method, and the results are compared.
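To make the ranking step concrete, here is a minimal sketch of classical (crisp) TOPSIS. Note that the dissertation uses Fuzzy TOPSIS, which works with fuzzy numbers; the crisp variant is substituted here only to show the core idea of ranking by distance to an ideal and an anti-ideal solution. The alternatives, criteria, and weights are hypothetical and not taken from the dissertation.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness coefficients for crisp TOPSIS (higher is better).

    matrix[i][j] -- score of alternative i on criterion j
    weights[j]   -- user-assigned weight of criterion j
    benefit[j]   -- True if larger values are better, False for cost criteria
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n_criteria = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in matrix]
    columns = list(zip(*weighted))
    # Ideal solution: best value per criterion; anti-ideal: worst value per criterion.
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti_ideal = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)        # distance to the ideal solution
        d_minus = math.dist(row, anti_ideal)  # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores

# Hypothetical example: three alternatives scored on cost, delay, and safety.
ALTERNATIVES = [[1.0, 1.0, 9.0], [2.0, 2.0, 5.0], [3.0, 3.0, 1.0]]
WEIGHTS = [0.5, 0.3, 0.2]
BENEFIT = [False, False, True]  # cost and delay are minimised, safety is maximised
SCORES = topsis(ALTERNATIVES, WEIGHTS, BENEFIT)
```

In this toy data the first alternative is best on every criterion and the third is worst on every criterion, so they coincide with the ideal and anti-ideal solutions respectively.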
An Examination of Parameter Recovery Using Different Multiple Matrix Booklet Designs
Educational large-scale assessments examine students’ achievement in various content domains and thus provide key findings to inform educational research and evidence-based educational policies. To this end, large-scale assessments involve hundreds of items to test students’ achievement in various content domains. Administering all these items to every student would overburden them, reduce participation rates, and consume too much time and resources. Hence multiple matrix sampling is used, in which the test items are distributed into various test forms called “booklets”, and each student is administered a booklet containing a subset of items that can sensibly be answered during the allotted test timeframe. However, there are numerous possibilities as to how these booklets can be designed, and the booklet design could influence parameter recovery precision at both global and subpopulation levels. One popular booklet design with many desirable characteristics is the Balanced Incomplete 7-Block or Youden squares design. Extensions of this booklet design are used in many large-scale assessments like TIMSS and PISA. This doctoral project examines the degree to which item and population parameters are recovered in real and simulated data in relation to matrix sparseness, when using various balanced incomplete block booklet designs. To this end, key factors (e.g., number of items, number of persons, number of items per person, and the match between the distributions of item and person parameters) are experimentally manipulated to learn how these factors affect the precision with which these designs recover true population parameters. In doing so, the project expands the empirical knowledge base on the statistical properties of booklet designs, which in turn could help improve the design of future large-scale studies.
Generally, the results show that for a typical large-scale assessment (with a sample size of at least 3,000 students and more than 100 test items), population and item parameters are recovered accurately and without bias in the various multi-matrix booklet designs. This is true both at the global population level and at the subgroup or sub-population levels. Further, for such a large-scale assessment, the match between the distribution of person abilities and the distribution of item difficulties is found to have an insignificant effect on the precision with which person and item parameters are recovered, when using these multi-matrix booklet designs.
These results give further support to the use of multi-matrix booklet designs as a reliable test abridgment technique in large-scale assessments, and for accurate measurement of performance gaps between policy-relevant subgroups within populations. However, item position effects were not fully considered, and different results are possible if similar studies are performed (a) with conditions involving items that poorly measure student abilities (e.g., with students having skewed ability distributions); or (b) simulating conditions where there is a lot of missing data because of non-response, instead of just missing by design. This should be further investigated in future studies.
Measuring students’ achievement in various domains through large-scale school achievement studies (so-called large-scale assessments) provides important findings for educational research and evidence-based education policy. However, testing achievement across many subject areas always requires the use of hundreds of items. If every test item were presented to every single student, this would place too great a burden on the students, who would consequently be less motivated to work on all items. In addition, administering all items to the entire sample would be very time- and resource-intensive. For these reasons, large-scale assessments often rely on a multi-matrix design in which different test booklet versions (so-called booklets) are randomly assigned to test takers. These booklets do not contain all items but only a subset of the item pool, with only some of the items overlapping between booklets. This ensures that students can complete all items presented to them within the allotted test time. However, there are numerous ways in which these booklets can be assembled, and the booklet design in turn affects the precision of parameter estimation at the population and subpopulation levels. One established booklet design is the Balanced Incomplete 7-Block design, also called the Youden squares design, which is used in various forms in many large-scale assessments such as TIMSS and PISA. Using both real and simulated data, this thesis examines the precision with which item and person parameters can be estimated under various balanced incomplete block designs, depending on the proportion of values missing by design. To this end, several design parameters were varied (e.g., number of items, sample size, number of items per booklet, degree of match between item and person parameters), and their influence on the precision of population parameter estimates was then analyzed. The thesis thus aims to extend empirical knowledge of the statistical properties of booklet designs, thereby contributing to the improvement of future large-scale assessments.
The results showed that for a typical large-scale assessment (with a sample size of at least 3,000 students and at least 100 items), person and item parameters were estimated precisely at both the population and subpopulation levels with all variants of the balanced incomplete block design used. It was also shown that, for samples of at least 3,000 students, the match between the ability distribution and the distribution of item difficulties had no meaningful influence on the precision with which the various booklet designs estimated person and item parameters.
The results confirm that, using multi-matrix designs, policy-relevant achievement differences between groups of students in the population can be estimated reliably and precisely. A limitation of this study is that item position effects were not comprehensively considered. It therefore cannot be ruled out that the results would differ if (a) items were used that estimate student achievement poorly (e.g., with a skewed distribution of achievement scores) or (b) there were high proportions of missing values not generated by the multi-matrix design. This should be investigated in future studies.
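The balanced incomplete 7-block (Youden squares) design named in this abstract can be illustrated with a short sketch. It assumes the standard cyclic construction from the difference set {0, 1, 3} modulo 7, which the abstract does not spell out; the function names are hypothetical, and the extended designs used in TIMSS or PISA differ in detail.

```python
from itertools import combinations

def youden_booklets(n_blocks=7, offsets=(0, 1, 3)):
    """Cyclic booklet design: booklet i holds item blocks (i + offset) mod n_blocks.

    The order of blocks inside a booklet is their position in the test.
    """
    return [[(i + o) % n_blocks for o in offsets] for i in range(n_blocks)]

def pair_counts(booklets):
    """Count how often each pair of item blocks appears together in a booklet."""
    counts = {}
    for booklet in booklets:
        for pair in combinations(sorted(booklet), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts

def position_counts(booklets, n_blocks=7):
    """counts[block][position]: how often a block sits at each booklet position."""
    n_positions = len(booklets[0])
    counts = [[0] * n_positions for _ in range(n_blocks)]
    for booklet in booklets:
        for position, block in enumerate(booklet):
            counts[block][position] += 1
    return counts

BOOKLETS = youden_booklets()
# Share of the item pool a student does not see (missing by design).
SPARSENESS = 1 - len(BOOKLETS[0]) / 7
```

Under this construction every pair of blocks shares exactly one booklet and every block appears once in each position, while each student sees only three of the seven blocks, so four sevenths of the responses are missing by design.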
Teacher Merit Pay In A Rural Western North Carolina County: A Quantitative Analysis Of The Effects Of Student Characteristics On A Teacher’s Likelihood Of Receiving A Monetary Bonus In Math Or Reading In Grades Three - Eight
This quantitative work is an exploratory study examining the North Carolina bonus pay structure enacted in 2016-2017 for math and reading teachers in grades three through eight. The study used student data from a rural western North Carolina county from the 2017-2018 academic year. It analyzed the validity of the EVAAS tool used in North Carolina to identify the top 25% of teachers in the affected grade levels and subject areas. The study examined student characteristics to find correlations between teachers receiving the merit-based bonus and the composition of students in their classrooms. The study found that white, mixed-race and Asian students have a greater likelihood of being in the classroom of a teacher who received the reading bonus. In grades three through five, having students with disabilities was negatively correlated with a teacher’s likelihood of receiving the reading bonus. In math, the study found a negative correlation between teachers receiving the bonus and students with disabilities and students identified as gifted learners. There was a positive correlation between math teacher bonuses and mixed-race, Hispanic and African-American students. The study might be used to help inform student classroom assignment practices in North Carolina.
A Statistical Approach to the Alignment of fMRI Data
Multi-subject functional Magnetic Resonance Imaging (fMRI) studies are critical. Anatomical and functional structure varies across subjects, so image alignment is necessary. We define a probabilistic model to describe functional alignment. Imposing a prior distribution, such as the matrix von Mises-Fisher distribution, on the orthogonal transformation parameter embeds anatomical information in the estimation of the parameters, i.e., it penalizes the combination of spatially distant voxels. Real applications show an improvement in the classification and interpretability of the results compared to various functional alignment methods.
A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium
When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.