51 research outputs found
Bias at a Second Glance: A Deep Dive into Bias for German Educational Peer-Review Data Modeling
Natural Language Processing (NLP) has become increasingly utilized to provide
adaptivity in educational applications. However, recent research has
highlighted a variety of biases in pre-trained language models. While existing
studies investigate bias in different domains, they are limited in addressing
fine-grained analysis on educational and multilingual corpora. In this work, we
analyze bias across text and through multiple architectures on a corpus of
9,165 German peer-reviews collected from university students over five years.
Notably, our corpus includes labels such as helpfulness, quality, and critical
aspect ratings from the peer-review recipient as well as demographic
attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1)
our collected corpus in connection with the clustered labels, (2) the most
common pre-trained German language models (T5, BERT, and GPT-2) and GloVe
embeddings, and (3) the language models after fine-tuning on our collected
data-set. In contrast to our initial expectations, we found that our collected
corpus does not reveal many biases in the co-occurrence analysis or in the
GloVe embeddings. However, the pre-trained German language models exhibit
substantial conceptual, racial, and gender bias and show significant changes in
bias across conceptual and racial axes during fine-tuning on the peer-review
data. With our research, we aim to contribute to the fourth UN sustainability
goal (quality education) with a novel dataset, an understanding of biases in
natural language education data, and the potential harms of not counteracting
biases in language models for educational tasks.
Comment: Accepted as a full paper at COLING 2022: The 29th International
Conference on Computational Linguistics, 12-17 of October 2022, Gyeongju,
Republic of Korea
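To illustrate the kind of measurement a Word Embedding Association Test (WEAT) produces, the following is a minimal sketch of the WEAT effect size as it is commonly defined: the difference between the mean associations of two target sets with two attribute sets, divided by the standard deviation of the associations over all target words. The function names and the toy 2-D vectors are assumptions made purely for illustration; they are not taken from the paper, its corpus, or its code, and the use of the sample standard deviation is one common convention rather than a confirmed detail of the study.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attr_a, attr_b):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    return mean(cosine(w, a) for a in attr_a) - mean(cosine(w, b) for b in attr_b)


def weat_effect_size(targets_x, targets_y, attr_a, attr_b):
    """Effect size: (mean s over X - mean s over Y) / std of s over X and Y."""
    s_x = [association(x, attr_a, attr_b) for x in targets_x]
    s_y = [association(y, attr_a, attr_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy embeddings (illustration only, not real word vectors).
targets_x = [(1.0, 0.0), (0.9, 0.1)]
targets_y = [(0.0, 1.0), (0.1, 0.9)]
attr_a = [(1.0, 0.0)]
attr_b = [(0.0, 1.0)]

effect = weat_effect_size(targets_x, targets_y, attr_a, attr_b)
print(f"WEAT effect size: {effect:.3f}")
```

A positive value indicates that the first target set is more strongly associated with the first attribute set than the second target set is; swapping the two target sets flips the sign.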
Trusting the Explainers: Teacher Validation of Explainable Artificial Intelligence for Course Design
Deep learning models for learning analytics have become increasingly popular
over the last few years; however, these approaches are still not widely adopted
in real-world settings, likely due to a lack of trust and transparency. In this
paper, we tackle this issue by implementing explainable AI methods for
black-box neural networks. This work focuses on the context of online and
blended learning and the use case of student success prediction models. We use
a pairwise study design, enabling us to investigate controlled differences
between pairs of courses. Our analyses cover five course pairs that differ in
one educationally relevant aspect and two popular instance-based explainable AI
methods (LIME and SHAP). We quantitatively compare the distances between the
explanations across courses and methods. We then validate the explanations of
LIME and SHAP with 26 semi-structured interviews of university-level educators
regarding which features they believe contribute most to student success, which
explanations they trust most, and how they could transform these insights into
actionable course design decisions. Our results show that quantitatively,
explainers significantly disagree with each other about what is important, and
qualitatively, experts themselves do not agree on which explanations are most
trustworthy. All code, extended results, and the interview protocol are
provided at https://github.com/epfl-ml4ed/trusting-explainers.
Comment: Accepted as a full paper at LAK 2023: The 13th International Learning
Analytics and Knowledge Conference, March 13-17, 2023, Arlington, Texas, US
Designing Intelligent Systems for Online Education: Open Challenges and Future Directions
The design and delivery of platforms for online education are fostering increasingly intense research. Scaling up education online brings new needs related to, for example, hardly manageable classes, overwhelming content alternatives, and academic dishonesty during remote interaction. However, with the impressive progress of the data mining and machine learning fields, combined with the large amounts of learning-related data and high-performance computing, it has been possible to gain a deeper understanding of the nature of learning and teaching online. Methods at the analytical and algorithmic levels are constantly being developed, and hybrid approaches are receiving increasing attention. Recent methods analyze not only the online traces left by students a posteriori, but also the extent to which these data can be turned into actionable insights and models, to support the above needs in a computationally efficient, adaptive, and timely way. In this paper, we present relevant open challenges lying at the intersection between the machine learning and educational communities that need to be addressed to further develop the field of intelligent systems for online education. Several areas of research in this field are identified, such as data availability and sharing, time-wise and multi-modal data modelling, generalizability, fairness, explainability, interpretability, privacy, and ethics behind models delivered for supporting education. Practical challenges and recommendations for possible research directions are provided for each of them, paving the way for future advances in this field.
Ripple: Concept-Based Interpretation for Raw Time Series Models in Education
Time series is the most prevalent form of input data for educational
prediction tasks. The vast majority of research using time series data focuses
on hand-crafted features, designed by experts for predictive performance and
interpretability. However, extracting these features is labor-intensive for
humans and computers. In this paper, we propose an approach that utilizes
irregular multivariate time series modeling with graph neural networks to
achieve comparable or better accuracy with raw time series clickstreams in
comparison to hand-crafted features. Furthermore, we extend concept activation
vectors for interpretability in raw time series models. We analyze these
advances in the education domain, addressing the task of early student
performance prediction for downstream targeted interventions and instructional
support. Our experimental analysis on 23 MOOCs with millions of combined
interactions over six behavioral dimensions shows that models designed with our
approach can (i) beat state-of-the-art educational time series baselines with
no feature extraction and (ii) provide interpretable insights for personalized
interventions. Source code: https://github.com/epfl-ml4ed/ripple/.
Comment: Accepted as a full paper at AAAI 2023: 37th AAAI Conference on
Artificial Intelligence (EAAI: AI for Education Special Track), 7-14 of
February 2023, Washington DC, US
Generative AI for Education (GAIED): Advances, Opportunities, and Challenges
This survey article has grown out of the GAIED (pronounced "guide") workshop
organized by the authors at the NeurIPS 2023 conference. We organized the GAIED
workshop as part of a community-building effort to bring together researchers,
educators, and practitioners to explore the potential of generative AI for
enhancing education. This article aims to provide an overview of the workshop
activities and highlight several future research directions in the area of
GAIED.
MultiModN- Multimodal, Multi-Task, Interpretable Modular Networks
Predicting multiple real-world tasks in a single model often requires a
particularly diverse feature space. Multimodal (MM) models aim to extract the
synergistic predictive potential of multiple data types to create a shared
feature space with aligned semantic meaning across inputs of drastically
varying sizes (e.g., images, text, sound). Most current MM architectures fuse
these representations in parallel, which not only limits their interpretability
but also creates a dependency on modality availability. We present MultiModN, a
multimodal, modular network that fuses latent representations in a sequence of
any number, combination, or type of modality while providing granular real-time
predictive feedback on any number or combination of predictive tasks.
MultiModN's composable pipeline is interpretable-by-design, as well as innately
multi-task and robust to the fundamental issue of biased missingness. We
perform four experiments on several benchmark MM datasets across 10 real-world
tasks (predicting medical diagnoses, academic performance, and weather), and
show that MultiModN's sequential MM fusion does not compromise performance
compared with a baseline of parallel fusion. By simulating the challenging bias
of missing not-at-random (MNAR), this work shows that, contrary to MultiModN,
parallel fusion baselines erroneously learn MNAR and suffer catastrophic
failure when faced with different patterns of MNAR at inference. To the best of
our knowledge, this is the first inherently MNAR-resistant approach to MM
modeling. In conclusion, MultiModN provides granular insights, robustness, and
flexibility without compromising performance.
Comment: Accepted as a full paper at NeurIPS 2023 in New Orleans, US
Enhancing Procedural Writing Through Personalized Example Retrieval: A Case Study on Cooking Recipes
Writing high-quality procedural texts is a challenging task for many learners. While example-based learning has shown promise as a feedback approach, a limitation arises when all learners receive the same content without considering their individual input or prior knowledge. Consequently, some learners struggle to grasp or relate to the feedback, finding it redundant and unhelpful. To address this issue, we present RELEX, an adaptive learning system designed to enhance procedural writing through personalized example-based learning. The core of our system is a multi-step example retrieval pipeline that selects a higher-quality and contextually relevant example for each learner based on their unique input. We instantiate our system in the domain of cooking recipes. Specifically, we leverage a fine-tuned Large Language Model to predict the quality score of the learner’s cooking recipe. Using this score, we retrieve recipes with higher quality from a vast database of over 180,000 recipes. Next, we apply BM25 to select the semantically most similar recipe in real time. Finally, we use domain knowledge and regular expressions to enrich the selected example recipe with personalized instructional explanations. We evaluate RELEX in a 2x2 controlled study (personalized vs. non-personalized examples, reflective prompts vs. none) with 200 participants. Our results show that providing tailored examples contributes to better writing performance and user experience.
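The two retrieval steps described in the abstract above (keep only recipes rated higher than the learner's text, then rank them with BM25) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the RELEX implementation: the quality scores are supplied by hand instead of being predicted by a fine-tuned language model, the whitespace tokenizer, the parameter values k1=1.5 and b=0.75, the function names, and the three toy recipes are all hypothetical, and the BM25 variant shown uses a smoothed, non-negative IDF.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of every tokenized document for the given query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(t for d in docs_tokens for t in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log((n_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


def retrieve_example(learner_text, learner_quality, recipes):
    """Filter to higher-quality recipes, then return the best BM25 match."""
    better = [r for r in recipes if r["quality"] > learner_quality]
    if not better:
        return None
    scores = bm25_scores(tokenize(learner_text), [tokenize(r["text"]) for r in better])
    best = max(range(len(better)), key=scores.__getitem__)
    return better[best]


# Hypothetical toy database; quality scores are made up for illustration.
recipes = [
    {"text": "boil pasta in salted water then drain and add tomato sauce", "quality": 4.5},
    {"text": "whisk eggs with sugar and bake the cake for thirty minutes", "quality": 4.8},
    {"text": "boil pasta add sauce", "quality": 2.0},
]

chosen = retrieve_example("cook pasta and add sauce", 3.0, recipes)
print(chosen["text"])
```

Filtering before ranking keeps the BM25 statistics (document frequencies and average length) restricted to the candidate pool; computing them over the full database instead would be an equally plausible design.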
Kinder mit Dyskalkulie fokussieren spontan weniger auf Anzahligkeit
Abstract in English
Extended abstract
Children with Developmental Dyscalculia Focus Spontaneously Less on Numerosities
Children differ in how much they spontaneously pay attention to quantitative aspects in their surroundings. The tendency to Spontaneously Focus On Numerosity (SFON) can be quantified and provides a stable and sensitive measure of using exact enumeration (Hannula & Lehtinen, 2005; Hannula, Lepola & Lehtinen, 2010; Hannula, Mattinen & Lehtinen, 2005; Hannula, Räsänen & Lehtinen, 2007). Moreover, SFON behaviour is positively related to counting and mathematical abilities (Hannula & Lehtinen, 2005; Hannula et al., 2007). Children who focus more on numbers show better performance in numerical tasks. In addition, the amount of SFON seems to develop consistently over time. Therefore, SFON can be used as a predictor of future numerical development (Hannula et al., 2010).
In children with developmental dyscalculia (DD), the acquisition of numerical abilities is specifically impaired. These children have problems in basic numerical skills, such as counting or the fast and accurate enumeration of small numerosities (subitizing) and the understanding of cardinal and ordinal principles, as well as in higher mathematical skills, such as arithmetic [detailed information about DD can be found e.g. in (Landerl & Kaufmann 2008; Vogel & Ansari 2012; von Aster & Lorenz 2005)]. About 3-6% of school children are affected by this learning disability (Reigosa-Crespo et al., 2012; Shalev, Auerbach, Manor & Gross-Tsur, 2000; Shalev & von Aster, 2008; von Aster, Schweiter & Weinhold Zulauf, 2007). In the present study, we addressed the question of whether children with DD differ in their spontaneous tendency to pay attention to exact numerosities.
Besides SFON, a variety of cognitive skills were examined in 76 children between 7 and 11 years of age; half of them were diagnosed with DD. Children with DD and control children were carefully matched for general cognitive abilities, but differed significantly in number-related measures.
Results indicated a significantly weaker SFON tendency in children with DD, which means that these children pay less attention to the aspect of exact numerosity compared to typically achieving children. Furthermore, the amount of SFON was positively related to number processing. Children who focus spontaneously less on exact quantities performed lower in numerical tasks.
Our results indicate that a low SFON tendency constitutes a behavioural characteristic of dyscalculia. There may be several reasons why SFON is diminished in DD. Children with DD might neglect or avoid numerical contents in their learning environment, e.g. as a result of receptive deficits, lack of opportunity or appropriate alimentation, or as a result of negative learning experiences. As a consequence, they acquire less practice and expertise in mathematical activities, which in turn could have negative effects on the development of automated SFON processes. On the other hand, Hannula et al. (2010) speculate that an initial reduction in SFON behaviour during early learning phases might be associated with children’s lower tendency to focus on mathematical aspects. Accordingly, a diminished SFON tendency in children with DD could additionally increase their numerical learning difficulties.
The amount of focusing on quantities is related to counting skills (Hannula & Lehtinen, 2005; Hannula et al., 2007). Children with DD dwell longer on less experienced counting strategies and show difficulties in subitizing, which is connected to lower math performance (Clements & Sarama, 2009; Frank 1989; Geary, Hoard & Hamson, 1999; Jordan, David Kaplan, Locuniak & Ramineni, 2007; Landerl, Bevan & Butterworth, 2004; Schleifer & Landerl, 2011). Such immature counting skills might lead to a reduced SFON tendency in children with DD. However, it might also be possible that deficits in SFON processes are accompanied by problems in the development of higher counting strategies.
In summary, the present study showed for the first time that children with DD focus their attention less on quantitative aspects in their natural surroundings. Whether the reduced SFON tendency influences the development of counting and calculation abilities in a negative way, or whether a deficit in basic number processing due to dyscalculia results in a diminished SFON amount, remains open. However, lower SFON behaviour delineates an additional characteristic of developmental dyscalculia and deserves special interest, since SFON is a stable and sensitive measure of further learning success. SFON tendency might be considered an early predictor of dyscalculia risk on the grounds that it can already be assessed in 3.5-year-old children. Finally, the encouragement to focus on numerical aspects through adequate learning environments can enhance SFON tendency, which positively affects the development of mathematical skills in children (Hannula et al., 2005). Hence, support in the development of SFON behaviour also seems advisable for children with dyscalculia.
Abstract in German (translated into English)
Summary: The degree to which we spontaneously attend to numerosity in our surroundings is referred to as SFON (Spontaneous Focusing On Numerosity). Previous studies have shown that children with a stronger SFON tendency have better counting skills and mathematical performance. SFON appears to develop stably and continuously and can be used as a predictor of future arithmetic performance. Accordingly, it is described as a stable and sensitive measure of numerical development. In children with dyscalculia, the development of number processing and calculation appears to be specifically impaired. The aim of the present study is to examine the SFON tendency in children with developmental dyscalculia. We tested SFON in 76 children between 7 and 11 years of age, 38 with and 38 without dyscalculia. The two groups showed comparable general cognitive abilities but differed specifically in mathematical performance. The results show a significantly weaker SFON tendency in children with dyscalculia; that is, compared with control children, children with dyscalculia spontaneously focus on numerosity less often. In addition, SFON correlates positively with number processing and arithmetic performance. That is, children with poorer mathematical skills spontaneously pay less attention to numerical aspects. The results show that a reduced SFON tendency appears to be a behavioural characteristic of developmental dyscalculia. This may be both a cause and a consequence of the impairment of counting and arithmetic skills. It is therefore advisable to assess SFON in children at risk of dyscalculia and to enrich support and the learning environment with regard to focusing on numerosity.