3,046 research outputs found

    Integrating knowledge tracing and item response theory: A tale of two frameworks

    Traditionally, the assessment and learning science communities rely on different paradigms to model student performance. The assessment community uses Item Response Theory (IRT), which models differences in student ability and problem difficulty, while the learning science community uses Knowledge Tracing, which captures skill acquisition. These two paradigms are complementary: IRT cannot be used to model student learning, while Knowledge Tracing assumes all students and problems are the same. Recently, two highly related models based on a principled synthesis of IRT and Knowledge Tracing were introduced. However, these two models were evaluated on different data sets, using different evaluation metrics and different ways of splitting the data into training and testing sets. In this paper we reconcile the models' results by presenting a unified view of the two models, and by evaluating the models under a common evaluation metric. We find that both models are equivalent and differ only in their training procedure. Our results show that the combined IRT and Knowledge Tracing models offer the best of assessment and learning sciences: high prediction accuracy like the IRT model, and the ability to model student learning like Knowledge Tracing.
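
To make the contrast between the two paradigms concrete, here is a minimal Python sketch of their textbook forms: a one-parameter (Rasch) IRT response probability and a single Bayesian Knowledge Tracing update. The function names and the parameter values in the usage lines are illustrative assumptions, and the sketch does not reproduce the combined models evaluated in the paper.

```python
import math


def rasch_p_correct(ability: float, difficulty: float) -> float:
    """1PL (Rasch) IRT: P(correct) = sigmoid(ability - difficulty).

    Ability varies per student and difficulty per problem, but nothing
    here changes over time, so learning is not modelled.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def bkt_update(p_known: float, correct: bool,
               p_learn: float, p_guess: float, p_slip: float) -> float:
    """One step of standard Bayesian Knowledge Tracing.

    First condition P(skill known) on the observed response, then apply
    the learning transition. The same p_learn, p_guess and p_slip are
    shared by every student and every problem.
    """
    if correct:
        numerator = p_known * (1.0 - p_slip)
        denominator = numerator + (1.0 - p_known) * p_guess
    else:
        numerator = p_known * p_slip
        denominator = numerator + (1.0 - p_known) * (1.0 - p_guess)
    posterior = numerator / denominator
    return posterior + (1.0 - posterior) * p_learn


# Illustrative values only (assumed, not taken from the paper).
p_static = rasch_p_correct(ability=0.5, difficulty=-0.2)
p_after_correct = bkt_update(0.3, True, p_learn=0.1, p_guess=0.2, p_slip=0.1)
p_after_wrong = bkt_update(0.3, False, p_learn=0.1, p_guess=0.2, p_slip=0.1)
```

The IRT function has per-student and per-problem parameters but no notion of time, whereas the BKT update tracks a changing knowledge estimate with parameters shared across students and problems, which is the complementarity the abstract describes.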

    Towards Interpretable Deep Learning Models for Knowledge Tracing

    As an important technique for modeling the knowledge states of learners, traditional knowledge tracing (KT) models have been widely used to support intelligent tutoring systems and MOOC platforms. Driven by rapid advances in deep learning, deep neural networks have recently been adopted to design new KT models with better prediction performance. However, the lack of interpretability of these models has seriously impeded their practical application, because their decision process is opaque and their inner structures are complex. We thus propose to adopt a post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models. Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret an RNN-based DLKT model by backpropagating the relevance from the model's output layer to its input layer. The experimental results show the feasibility of using the LRP method to interpret the DLKT model's predictions, and partially validate the computed relevance scores at both the question level and the concept level. We believe this can be a solid step towards fully interpreting DLKT models and promoting their practical application in the education domain.
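
The core operation of LRP, redistributing relevance from a layer's outputs back to its inputs, can be sketched for a single dense layer. The example below uses the commonly cited epsilon rule in plain Python. It is an illustrative assumption about the general technique, not the RNN-specific propagation rules used in the paper, and the function name and toy numbers are hypothetical.

```python
def lrp_epsilon_dense(x, weights, bias, relevance_out, eps=1e-6):
    """Propagate relevance through one dense layer with the LRP epsilon rule.

    x:             input activations, length n_in
    weights:       weights[i][j] connects input i to output j
    bias:          output biases, length n_out
    relevance_out: relevance assigned to each output, length n_out

    Each input i receives a share of output j's relevance proportional
    to its contribution x[i] * weights[i][j] to the pre-activation z[j].
    """
    n_in, n_out = len(x), len(bias)
    z = [sum(x[i] * weights[i][j] for i in range(n_in)) + bias[j]
         for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # The small stabiliser keeps the division safe when z[j] is near 0.
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n_in):
            relevance_in[i] += x[i] * weights[i][j] / denom * relevance_out[j]
    return relevance_in


# Toy layer (assumed values): two inputs, two outputs, zero bias.
x = [1.0, 2.0]
weights = [[0.5, -1.0],
           [0.25, 1.0]]
bias = [0.0, 0.0]
relevance_in = lrp_epsilon_dense(x, weights, bias, relevance_out=[1.0, 2.0])
```

With zero bias and a small `eps`, the input relevances sum to approximately the same total as the output relevances, which is the conservation property that makes the scores readable as a decomposition of the prediction.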

    EDM 2011: 4th international conference on educational data mining : Eindhoven, July 6-8, 2011 : proceedings


    Psychometrics in Practice at RCEC

    A broad range of topics is dealt with in this volume: from combining the psychometric generalizability and item response theories to ideas for an integrated formative use of data-driven decision making, assessment for learning and diagnostic testing. A number of chapters pay attention to computerized (adaptive) and classification testing. Other chapters treat the quality of testing in a general sense, but for topics like maintaining standards or the testing of writing ability, the quality of testing is dealt with more specifically. All authors are connected to RCEC as researchers. They present one of their current research topics and provide some insight into the focus of RCEC. The topics were selected and edited so that the book should be of special interest to educational researchers, psychometricians and practitioners in educational assessment.

    IRT-Based Adaptive Hints to Scaffold Learning in Programming

    Over the past few decades, many studies conducted in the field of learning science have described that scaffolding plays an important role in human learning. To scaffold a learner efficiently, a teacher should predict how much support a learner must have to complete tasks and then decide the optimal degree of assistance to support the learner's development. Nevertheless, it is difficult to ascertain the optimal degree of assistance for learner development. For this study, it is assumed that optimal scaffolding is based on a probabilistic decision rule: Given a teacher's assistance to facilitate the learner development, an optimal probability exists for a learner to solve a task. To ascertain that optimal probability, we developed a scaffolding system that provides adaptive hints to adjust the predictive probability of the learner's successful performance to the previously determined certain value, using a probabilistic model, i.e., item response theory (IRT). Furthermore, using the scaffolding system, we compared learning performances by changing the predictive probability. Results show that scaffolding to achieve 0.5 learner success probability provides the best performance. Additionally, results demonstrate that a scaffolding system providing 0.5 probability decreases the number of hints (amount of support) automatically as a fading function according to the learner's growth capability.
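
The decision rule described above can be sketched as follows: given an ability estimate and an assumed effective difficulty of the task under each hint level, pick the hint level whose predicted success probability is closest to the target (0.5 in the study). The Rasch form, the function names and the difficulty values are illustrative assumptions, not the system's actual model.

```python
import math


def p_success(ability: float, difficulty: float) -> float:
    """Rasch-style predicted probability of solving the task."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def choose_hint(ability: float, hint_difficulties, target: float = 0.5) -> int:
    """Return the index of the hint level whose predicted success
    probability is closest to the target.

    hint_difficulties[k] is the assumed effective task difficulty when
    hint level k is shown (index 0 = no hint, higher index = more help).
    """
    return min(
        range(len(hint_difficulties)),
        key=lambda k: abs(p_success(ability, hint_difficulties[k]) - target),
    )


# Hypothetical difficulties: each extra hint makes the task easier.
difficulties = [2.0, 1.0, 0.0, -1.0]
hint_for_novice = choose_hint(ability=0.1, hint_difficulties=difficulties)
hint_for_advanced = choose_hint(ability=1.8, hint_difficulties=difficulties)
```

Under these assumed values the lower-ability learner is assigned a stronger hint than the higher-ability learner, so support fades as the ability estimate grows, which mirrors the fading behaviour reported in the abstract.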

    New measurement paradigms

    This collection of New Measurement Paradigms papers represents a snapshot of the variety of measurement methods in use at the time of writing across several projects funded by the National Science Foundation (US) through its REESE and DR K–12 programs. All of the projects are developing and testing intelligent learning environments that seek to carefully measure and promote student learning, and the purpose of this collection of papers is to describe and illustrate the use of several measurement methods employed to achieve this. The papers are deliberately short because they are designed to introduce the methods in use and not to be a textbook chapter on each method. The New Measurement Paradigms collection is designed to serve as a reference point for researchers who are working in projects that are creating e-learning environments in which there is a need to make judgments about students’ levels of knowledge and skills, or for those interested in this but who have not yet delved into these methods

    Bridging Mathematics with Word Problems

    The aim of this thesis was to explore several important aspects of word problems: the nature of word problems used in school mathematics textbooks and the difficulty level of different types of word problems. The specific goals were to investigate students’ performance when solving various types of word problems and to determine whether students’ word-problem skills and their beliefs about word problem-solving can be improved by enriching word problems used in mathematics teaching. To achieve the goals, this thesis reports on five original studies, as follows. Study I compared the characteristics of word problems presented in Thai and Finnish school mathematics textbooks. The analyses included 1,565 word problems from a series of second- to fourth-grade Thai and Finnish mathematics textbooks. The overall results show that the word problems used in Finnish textbooks differ from those in Thai textbooks in many ways. Finnish textbooks contain more multistep word problems, while in Thai textbooks, one-step word problems appear more frequently. Thai textbooks have a smaller percentage of repetitive sections (ones that include only the same type of problems) than Finnish textbooks. In both countries, the percentage of word problems requiring the use of realistic considerations is extremely low, less than five percent of the total. Studies II and III presented the impacts of a Word Problem Enrichment (WPE) programme, developed to encourage teachers to use innovative self-created word problems to improve student mathematical modelling and problem-solving skills. Participants comprised 10 classroom teachers and their 170 students from fourth and sixth grades, from elementary schools in southwest Finland. In Study II, the intervention effectiveness on student problem-solving performance was investigated.
The results suggested that enriching word problems used in mathematics teaching is a promising method for improving student problem-solving skills when solving non-routine and application word problems. However, it is not known whether WPE has an effect on student beliefs about word problem-solving, or how the programme works for students with different initial motivation in learning mathematics. Study III examined the effectiveness of WPE on student beliefs about word problem-solving by using latent profile analysis (LPA) and structural equation modelling (SEM) to analyse relationships among the different cognitive, motivation, and belief factors. Results indicated that the impacts of WPE vary with the initial motivation level of students. The effects of WPE on student beliefs appeared only in students with a low initial motivation level, while its impacts on student problem-solving performance were found only in students with a high initial motivation level. Studies IV and V were conducted to examine hypotheses regarding (1) the dimensionality of students’ performance on word problems and (2) the difficulty level of three types of word problems (routine, non-routine and application word problems) by utilizing item response theory (IRT) modelling. The data used in Study IV was collected as part of the Word Problem project (Studies II and III). Participants comprised 170 fourth- and sixth-grade students. Students’ problem-solving performance was assessed with a word problem-solving test, including five word problems: one routine, three non-routine, and one application. The results of Study IV show that students’ performance on word problems can be seen as a unidimensional construct, contrary to the original assumption. The results of the IRT model indicate that the theoretically demanding application word problem has a higher difficulty level than non-routine and routine word problems.
Nevertheless, the results leave it unclear whether this application word problem (used in Study IV) is harder because of its demand for realistic considerations or because of other possibly relevant factors (e.g. decimal numbers included, division, more problem-solving steps required). Moreover, the sample size of Study IV could be considered relatively small for this kind of complicated IRT model. Therefore, Study V uses a larger sample size and a bigger set of word problems with more variety in application and non-routine word problems. The data used in Study V was collected as part of the Quest for Meaning project. Participants comprised 891 fourth-grade students (446 boys and 445 girls) from different elementary schools situated in cities, small towns, and rural communities in southern Finland. In line with Study IV, the results of Study V indicated that students’ performance on word problems can be seen as a unidimensional construct. Concerning item difficulty level, the results of the IRT model do not show a clear distinction among word-problem types and reject the hypothesis that application word problems have a higher difficulty level than non-routine word problems. Some non-routine word problems appear to be more difficult than the application word problem, even though other characteristics of these two types of word problems were very similar (e.g., they required the same type of operation and the same number of problem-solving steps). The results of the five studies reveal that even though the mathematics textbooks were highly regarded in Thailand and Finland, most of the word problems in them have a simple goal and do not demand any realistic considerations. These results strongly suggest that more innovative application word problems are needed in classroom mathematics.
In our study, we developed the WPE to encourage teachers to develop their own meaningful non-routine and application word problems, and to use these self-created word problems to improve mathematical modelling and students’ word problem-solving performance. The results show that WPE is a promising approach to improve not only student problem-solving skills but also student beliefs about word problem-solving. The impacts of WPE differ depending on students’ initial motivation level. The impacts of WPE on student beliefs were found only in students with a low initial motivation level, while its impacts on student problem-solving performance were found only in students with a high initial motivation level. These results suggest that in classroom practice, it is important that teachers provide enough support for students to be more confident and feel less overwhelmed when facing non-routine and application word problems. Teachers should be aware of the differences between word-problem types and utilise this information in planning how to scaffold students’ word problem-solving by giving word problems based on their difficulty level.

    Separating cognitive and content domains in mathematical competence

    The present study investigates the empirical separability of mathematical (a) content domains, (b) cognitive domains, and (c) content-specific cognitive domains. There were 122 items representing two content domains (linear equations vs. theorem of Pythagoras) combined with two cognitive domains (modeling competence vs. technical competence) administered in a study with 1,570 German ninth graders. A unidimensional item response theory model, two two-dimensional multidimensional item response theory (MIRT) models (dimensions: content domains and cognitive domains, respectively), and a four-dimensional MIRT model (dimensions: content-specific cognitive domains) were compared with regard to model fit and latent correlations. Results indicate that the two content and the two cognitive domains can each be empirically separated. Content domains are better separable than cognitive domains. A differentiation of content-specific cognitive domains shows the best fit to the empirical data. Differential gender effects mostly confirm that the separated dimensions have different psychological meaning. Potential explanations, practical implications, and possible directions for future research are discussed. (DIPF/Orig.)