4 research outputs found

Measurement Integrity in Peer Prediction: A Peer Assessment Case Study

Authors: Burrell Noah; Schoenebeck Grant
Publication date: 22/09/2022

We propose measurement integrity, a property related to ex post reward fairness, as a novel desideratum for peer prediction mechanisms in many natural applications. Like robustness against strategic reporting, the property that has been the primary focus of the peer prediction literature, measurement integrity is an important consideration for understanding the practical performance of peer prediction mechanisms. We perform computational experiments, both with an agent-based model and with real data, to empirically evaluate peer prediction mechanisms according to both of these important properties. Our evaluations simulate the application of peer prediction mechanisms to peer assessment -- a setting in which ex post fairness concerns are particularly salient. We find that peer prediction mechanisms, as proposed in the literature, largely fail to demonstrate significant measurement integrity in our experiments. We also find that theoretical properties concerning robustness against strategic reporting are somewhat noisy predictors of empirical performance. Further, there is an apparent trade-off between our two dimensions of analysis. The best-performing mechanisms in terms of measurement integrity are highly susceptible to strategic reporting. Ultimately, however, we show that supplementing mechanisms with realistic parametric statistical models can, in some cases, improve performance along both dimensions of our analysis and result in mechanisms that strike the best balance between them.

Comment: The code for our experiments is hosted in the following GitHub repository: https://github.com/burrelln/Measurement-Integrity-and-Peer-Assessment. Version 2 (uploaded on 9/22/22) introduces experiments with real peer grading data alongside significant changes to the framing of the paper and presentation of the result

arXiv.org e-Print Archive

Conceptual Integrity Model for Improving the Quality of Peer Assessment (Model konsep integriti ke arah peningkatan kualiti penilaian rakan)

Author: Zainal Abidin Noor Atikah
Publication date: 01/08/2019

them in a given task. However, peer assessment is seldom practised in higher education institutions because its quality remains in doubt, particularly with respect to the integrity of students as assessors. Accordingly, this study proposes an integrity model for improving the quality of peer assessment. The study used a multiphase design consisting of three (3) phases. In Phase I (document analysis and interviews), documents from 2010 to 2018 were used, and interviews with six (6) experts in Technical and Vocational Education (Pendidikan Teknikal dan Vokasional, PTV) yielded three (3) elements of integrity: Self-Integrity (self-motivation, courage, self-discipline, and transparency), Social Interaction (honesty, fairness, consistency, trustworthiness, and unity), and Work Commitment (effort, responsibility, and ethics). In Phase II (instrument development), the Modified Delphi (MD) technique was used to obtain expert consensus on the items constructed. Following agreement among the MD experts, 90 items were used in pilot study I and analysed using Winsteps, and 19 items were dropped. Pilot study II was then conducted to establish the validity and reliability of the items using Exploratory Factor Analysis (EFA). The EFA found 3 overlapping items, which were dropped before the final phase. In Phase III (model development), 543 questionnaires were analysed using Structural Equation Modelling (SEM-AMOS). The Maximum Likelihood Estimates show a value of C.R > ± 1.96 for the regression coefficient between integrity and assessment quality, which is positive and significant (β = 0.85, C.R = 12.558, p < 0.001). This indicates that integrity influences assessment quality. Meanwhile, willingness to allocate time is a moderately important partial mediator (rc = 0.23) that links integrity and peer assessment quality indirectly.
However, gender is a full moderator of the effect of integrity on peer assessment quality, because female students obtained Δχ² = 16.754, which is greater than 3.84, whereas male students obtained Δχ² = 3.218. The integrity model can serve as a guiding model for lecturers to apply peer assessment systematically and to anticipate what may occur while the assessment activity is carried out.

UTHM Institutional Repository

On Connections Between Machine Learning And Information Elicitation, Choice Modeling, And Theoretical Computer Science

Author: Agarwal Arpit
Publication venue: ScholarlyCommons
Publication date: 01/01/2021

Machine learning, which has its origins at the intersection of computer science and statistics, is now a rapidly growing area of research that is being integrated into almost every discipline in science and business such as economics, marketing and information retrieval. As a consequence of this integration, it is necessary to understand how machine learning interacts with these disciplines and to understand fundamental questions that arise at the resulting interfaces. The goal of my thesis research is to study these interdisciplinary questions at the interface of machine learning and other disciplines including mechanism design/information elicitation, preference/choice modeling, and theoretical computer science

ScholarlyCommons@Penn

Practical Peer Prediction for Peer Assessment

Authors: Parkes David C.; Shnayder Victor
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 01/01/2016

We provide an empirical analysis of peer prediction mechanisms, which reward participants for information in settings when there is no ground truth against which to score reports. We simulate the mechanisms on a dataset of three million peer assessments from the edX MOOC platform. We evaluate different mechanisms on score variability, which is connected to fairness, risk aversion, and participant learning. We also assess the magnitude of the incentives to invest effort, and study the effect of participant coordination on low-information signals. We find that the correlated agreement mechanism has lower variation in reward than other mechanisms. A concern is that the gain from exerting effort is relatively low across all mechanisms, due to frequent disagreement between peers. Our conclusions are relevant for crowdsourcing in education as well as other domains.

Harvard University - DASH

Association for the Advancement of Artificial Intelligence: AAAI Publications