42 research outputs found

Item Response Theory for Peer Assessment

Author: Maomi Ueno
Masaki Uto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2016

Peer assessment, an assessment method based on a constructivist approach, has become popular in recent years. However, its reliability depends on rater characteristics. For this reason, several item response models that incorporate rater parameters have been proposed. These models are expected to improve reliability if their parameters can be estimated accurately. When they are applied to actual peer assessment, however, parameter estimation accuracy is reduced for two reasons. 1) Because the models include higher-dimensional rater parameters, the number of rater parameters is two or more times the number of raters. 2) The accuracy of parameter estimation from sparse peer assessment data depends strongly on hand-tuned parameters, called hyperparameters. To solve these problems, this article proposes a new item response model for peer assessment that incorporates rater parameters while keeping their number as small as possible. It also proposes a parameter estimation method for the proposed model that uses a hierarchical Bayes model and can learn the hyperparameters from data. Finally, the article demonstrates the effectiveness of the proposed method through simulation and actual data experiments.

Creative Repository of Electro-Communications

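Equating accuracy of performance tests based on item response models with rater characteristic parameters (評価者特性パラメータを付与した項目反応モデルに基づくパフォーマンス・テストの等化精度)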

Author: Masaki UTO
宇都雅輝
Publication venue: 'University of St. Thomas (Project Muse)'
Publication date: 01/06/2018

Performance assessment has recently attracted attention as a way to measure examinees' practical, higher-order abilities. A persistent problem, however, is that ability measurement accuracy depends strongly on the characteristics of raters and performance tasks. To address this, many item response theory (IRT) models that incorporate rater and task characteristic parameters have been proposed and shown to be effective. In practice, results from several different performance tests often need to be compared. Applying IRT models in such cases requires equating, the statistical process of placing the model parameters estimated from each test on a common scale so that scores from different tests are comparable. Equating performance tests generally requires designing the tests so that they share some tasks and raters. Equating accuracy is then expected to depend on various conditions, including the number of common tasks and common raters, the ability distribution of examinees in each test, and the numbers of examinees, raters, and tasks. However, the effect of these factors on equating accuracy has not been clarified, and it has not been shown how tests should be designed to achieve accurate equating. This study therefore identifies experimentally the factors that affect equating accuracy when IRT models are applied to performance assessment and, based on the results, presents criteria for test designs that achieve high equating accuracy.

Creative Repository of Electro-Communications

Empirical comparison of item response theory models with rater's parameters

Author: Maomi Ueno
Masaki Uto
Publication venue: 'Elsevier BV'
Publication date: 01/05/2018

In various assessment contexts, including entrance examinations, educational assessments, and personnel appraisal, performance assessment by raters has attracted much attention as a way to measure examinees' higher-order abilities. However, a persistent difficulty is that ability measurement accuracy depends strongly on rater and task characteristics. To resolve this shortcoming, various item response theory (IRT) models that incorporate rater and task characteristic parameters have been proposed. However, because many models with different rater and task parameters exist, it is difficult to understand each model's features. Therefore, this study presents empirical comparisons of IRT models. Specifically, after reviewing and summarizing the features of existing models, we compare their performance through simulation and actual data experiments.

Directory of Open Access Journals

Creative Repository of Electro-Communications

A support system for constructing paper structure based on an information-theoretic approach (情報論的アプローチに基づく論文構成構築支援システム)

Author: Masaki Uto
宇都雅輝
Publication date: 21/09/2016

The University of Electro-Communications (電気通信大学) 201

Creative Repository of Electro-Communications

Item response theory robust to aberrant raters in peer assessment (ピアアセスメントにおける異質評価者に頑健な項目反応理論)

Author: Maomi UENO
Masaki UTO
宇都雅輝
植野真臣
Publication venue: 'University of St. Thomas (Project Muse)'
Publication date: 01/01/2018

With the spread of large-scale e-learning such as MOOCs, there is a growing need to use peer assessment to measure learner ability. A known problem is that measurement accuracy depends strongly on rater characteristics. To address it, many item response theory (IRT) models that incorporate rater characteristic parameters have recently been proposed. However, existing models cannot always represent the characteristics of aberrant raters, whose rating criteria differ extremely from those of other raters, so ability measurement accuracy decreases when the models are applied to peer assessment in which aberrant raters may be present. To resolve this problem, this paper proposes a new IRT model that incorporates rater characteristic parameters corresponding to 1) severity, 2) consistency, and 3) range restriction. The proposed model has the following advantages. 1) Because it can represent rater characteristics flexibly, it can improve model fit to aberrant raters' rating data. 2) Because the influence of aberrant raters can be reflected accurately in ability estimates, it is expected to measure ability more accurately than existing models when aberrant raters are present. Simulation and actual data experiments demonstrate the effectiveness of the proposed model.

Creative Repository of Electro-Communications

Group optimization to maximize peer assessment accuracy using item response theory and integer programming

Author: Duc-Thien Nguyen
Maomi Ueno
Masaki Uto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

With the spread of large-scale e-learning environments such as MOOCs, peer assessment has come into wide use for measuring learner ability. When the number of learners is large, peer assessment is often conducted by dividing learners into multiple groups to reduce each learner's assessment workload. In such cases, however, peer assessment accuracy depends on how the groups are formed. To resolve that difficulty, this study proposes a group formation method that maximizes peer assessment accuracy using item response theory and integer programming. Experimental results, however, show that the proposed method does not achieve sufficiently higher accuracy than random group formation. Therefore, this study further proposes an external rater assignment method that assigns a few outside-group raters to each learner after groups are formed with the proposed group formation method. Simulation and actual data experiments demonstrate that the proposed external rater assignment can substantially improve peer assessment accuracy.

Creative Repository of Electro-Communications

A generalized many-facet Rasch model and its Bayesian estimation using Hamiltonian Monte Carlo

Author: Maomi Ueno
Masaki Uto
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2020

Performance assessments, in which raters assess examinee performance for given tasks, have a persistent difficulty in that ability measurement accuracy depends on rater characteristics. To address this problem, various item response theory (IRT) models that incorporate rater characteristic parameters have been proposed. Conventional models partially consider three typical rater characteristics: severity, consistency, and range restriction. Each is important for improving model fit and ability measurement accuracy, especially as the diversity of raters increases. However, no model capable of representing all three simultaneously has been proposed. One obstacle to developing such a complex model is the difficulty of parameter estimation. Maximum likelihood estimation, which is used in most conventional models, generally leads to unstable and inaccurate parameter estimates in complex models. Bayesian estimation is expected to provide more robust estimates. Although it incurs high computational costs, recent increases in computational capabilities and the development of efficient Markov chain Monte Carlo (MCMC) algorithms make its use feasible. We thus propose a new IRT model that can represent all three typical rater characteristics. The model is formulated as a generalization of the many-facet Rasch model. We also develop a Bayesian estimation method for the proposed model using No-U-Turn Hamiltonian Monte Carlo, a state-of-the-art MCMC algorithm. We demonstrate the effectiveness of the proposed method through simulation and actual data experiments.

Creative Repository of Electro-Communications
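
As a rough illustration of the entry above, the sketch below computes rating-category probabilities under a basic many-facet Rasch model with a single rater severity parameter. It is not the generalized model proposed in the paper, whose exact formulation is not given in this listing; the function names, parameter names, and example values are all assumptions made for illustration.

```python
import math

def mfrm_category_probs(theta, task_difficulty, rater_severity, thresholds):
    """Probabilities of ratings 0..K under a basic many-facet Rasch model.

    Assumed form: log(P_k / P_{k-1}) = theta - task_difficulty
                                       - rater_severity - thresholds[k-1].
    All argument names are illustrative, not taken from the paper.
    """
    step = theta - task_difficulty - rater_severity
    logits = [0.0]  # unnormalized log-weight of category 0
    for d in thresholds:
        logits.append(logits[-1] + step - d)
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(probs):
    """Mean rating implied by a list of category probabilities."""
    return sum(k * p for k, p in enumerate(probs))

if __name__ == "__main__":
    # Hypothetical values: same examinee and task, two raters of different severity.
    lenient = mfrm_category_probs(0.5, 0.0, -1.0, [-0.5, 0.0, 0.5])
    severe = mfrm_category_probs(0.5, 0.0, 1.0, [-0.5, 0.0, 0.5])
    print(expected_rating(lenient), expected_rating(severe))
```

In this form a more severe rater shifts probability toward lower categories for the same examinee, which is the dependence on rater characteristics that the abstract describes. Consistency and range restriction would need additional rater parameters that this sketch omits.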

Learning large-scale Bayesian networks with the RAI algorithm using the Bayes factor (Bayes factorを用いたRAIアルゴリズムによる大規模ベイジアンネットワーク学習)

Author: Kazuki NATORI
Maomi UENO
Masaki UTO
名取和樹
宇都雅輝
植野真臣
Publication venue: 'University of St. Thomas (Project Muse)'
Publication date: 01/05/2018

Learning the structure of Bayesian networks with asymptotic consistency is NP-hard. Search algorithms based on dynamic programming, A* search, and integer programming have been developed, but they remain limited to structures of about 60 nodes, and a fundamentally different approach is urgently needed for large-scale structure learning. In the field of causal models, meanwhile, a structure learning approach has been proposed that drastically reduces computational cost by combining conditional independence (CI) tests with edge orientation. This is called the constraint-based approach, and the RAI algorithm is known as its most accurate state-of-the-art learning method. However, the RAI algorithm uses either statistical hypothesis testing or conditional mutual information for its CI tests. The accuracy of the former depends on the p-value and on a significance level set by the user; p-values tend to become small as the amount of data grows, which is known to cause the null hypothesis to be rejected erroneously. The accuracy of the latter depends strongly on the threshold setting. Neither therefore guarantees that the true structure is learned asymptotically. This paper incorporates a CI test based on the Bayes factor, which is asymptotically consistent, into the RAI algorithm, enabling large-scale structure learning with several hundred nodes. Simulation experiments on several benchmark networks show that the proposed method performs best empirically.

Creative Repository of Electro-Communications

Group formation optimization using item response theory in peer assessment (ピアアセスメントにおける項目反応理論を用いたグループ構成最適化)

Author: Maomi UENO
Masaki UTO
Thien Duc NGUYEN
グエンドク　ティエン
宇都雅輝
植野真臣
Publication venue: 'University of St. Thomas (Project Muse)'
Publication date: 01/02/2018

Peer assessment has attracted attention in recent years as an assessment method based on social constructivism. When the number of learners is large, as in MOOCs, peer assessment is often conducted by dividing learners into multiple groups and having members assess one another within each group, which reduces the assessment workload. In this case, however, ability measurement accuracy depends on how the groups are formed. To address this problem, this study proposes a method that uses item response theory to form groups so as to maximize ability measurement accuracy. Experiments showed, however, that the proposed method does not necessarily yield higher accuracy than random group formation. The study therefore relaxes the constraint that assessment occurs only among learners in the same group and proposes an external rater selection method that assigns a few outside-group raters to each learner. Simulation and human-subject experiments confirmed that adding a few external raters with the proposed method improves ability measurement accuracy compared with assessment by in-group learners only.

Creative Repository of Electro-Communications

A hidden Markov IRT model for dynamic assessment (ダイナミックアセスメントのための隠れマルコフIRTモデル)

Author: Emiko TSUTSUMI
Maomi UENO
Masaki UTO
堤瑛美子
宇都雅輝
植野真臣
Publication venue: 'University of St. Thomas (Project Muse)'
Publication date: 01/02/2018

The most difficult problem in education is that learners do not develop sufficiently if the teacher teaches either too much or too little. Teachers therefore need to predict each learner's level of understanding and the optimal degree of support. To predict learner performance under scaffolding, scaffolding systems have been developed that use item response theory (IRT) to present hints so that the predicted probability of a correct answer is optimal. However, conventional IRT does not model changes in learner ability, so it cannot predict the probability of a correct answer accurately and may lead to over-assistance or a lack of assistance. This study proposes a new IRT model that incorporates the process by which learner ability changes over time, assuming that ability changes according to a hidden Markov process. The proposed model has two new parameters: a window size representing the length of time (number of tasks) over which an ability value persists, and a variation parameter reflecting the degree of change in ability. Because their optimal values are estimated from data, the model is expected to reflect learners' true ability changes and to improve prediction accuracy. Experiments with actual data show that the proposed model improves the accuracy of predicting learner performance.

Creative Repository of Electro-Communications
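
To make the window-size idea in the entry above concrete, the sketch below predicts correct-response probabilities with a plain Rasch model in which ability is held constant within each window of tasks and may differ between windows. This is an assumption-laden illustration, not the paper's model: the abilities are supplied by hand rather than estimated through a hidden Markov process, and every name and value is hypothetical.

```python
import math

def rasch_p_correct(theta, difficulty):
    """Rasch model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def windowed_abilities(window_thetas, window_size, n_tasks):
    """Expand one ability value per window into one value per task.

    Ability stays constant for `window_size` consecutive tasks; the last
    window's value is reused if n_tasks runs past the supplied windows.
    """
    last = len(window_thetas) - 1
    return [window_thetas[min(t // window_size, last)] for t in range(n_tasks)]

def predict_sequence(window_thetas, window_size, difficulties):
    """Predicted probability of a correct answer for each task in order."""
    thetas = windowed_abilities(window_thetas, window_size, len(difficulties))
    return [rasch_p_correct(th, b) for th, b in zip(thetas, difficulties)]

if __name__ == "__main__":
    # Hypothetical learner whose ability rises after every two tasks.
    for p in predict_sequence([-0.5, 0.0, 0.8], 2, [0.0] * 6):
        print(round(p, 3))
```

A model with a single fixed ability would predict the same probability for all six equally difficult tasks here, whereas the windowed version lets the prediction rise as the learner improves.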