10,037 research outputs found
Training Noise-Robust Spoken Phrase Detectors with Scarce and Private Data: An Application to Classroom Observation Videos
We explore how to automatically detect specific phrases in audio from noisy, multi-speaker videos using deep neural networks. Specifically, we focus on classroom observation videos that contain a few adult teachers and several small children (< 5 years old). At any point in these videos, multiple people may be talking, shouting, crying, or singing simultaneously. Our goal is to recognize polite speech phrases such as "Good job", "Thank you", "Please", and "You're welcome", as the occurrence of such speech is one of the behavioral markers used in classroom observation coding via the Classroom Assessment Scoring System (CLASS) protocol. Commercial speech recognition services such as Google Cloud Speech are impractical because of data privacy concerns. Therefore, we train and test our own custom models using a combination of publicly available classroom videos from YouTube, as well as a private dataset of real classroom observation videos collected by our colleagues at the University of Virginia. We also crowdsource an additional 1152 recordings of polite speech phrases to augment our training dataset. Our contributions are the following: (1) we design a crowdsourcing task for efficiently labeling speech events in classroom videos, (2) we develop a neural network-based architecture for speech recognition, robust to noise and overlapping speech, and (3) we explore methods to synthesize new and authentic audio data, both to increase the training set size and reduce the class imbalance. Finally, using our trained polite speech detector, (4) we investigate the relationship between polite speech and CLASS scores and enable teachers to visualize their use of polite language.
Group-Level Emotion Recognition Using a Unimodal Privacy-Safe Non-Individual Approach
This article presents our unimodal privacy-safe and non-individual proposal
for the audio-video group emotion recognition subtask at the Emotion
Recognition in the Wild (EmotiW) Challenge 2020. This sub-challenge aims to
classify in-the-wild videos into three categories: Positive, Neutral and
Negative. Recent deep learning models have shown tremendous advances in
analyzing interactions between people, predicting human behavior and affective
evaluation. Nonetheless, their performance comes from individual-based
analysis, which means summing up and averaging scores from individual
detections, which inevitably leads to some privacy issues. In this research, we
investigated a frugal approach towards a model able to capture the global moods
from the whole image without using face or pose detection, or any
individual-based feature as input. The proposed methodology mixes
state-of-the-art and dedicated synthetic corpora as training sources. With an
in-depth exploration of neural network architectures for group-level emotion
recognition, we built a VGG-based model achieving 59.13% accuracy on the VGAF
test set (eleventh place of the challenge). Given that the analysis is unimodal
based only on global features and that the performance is evaluated on a
real-world dataset, these results are promising and let us envision extending
this model to multimodality for classroom ambiance evaluation, our final target
application.
Use of automated coding methods to assess motivational behaviour in education
Teachers' motivational behaviour is related to important student outcomes. Assessing teachers' motivational behaviour has been helpful to improve teaching quality and enhance student outcomes. However, researchers in educational psychology have relied on self-report or observer ratings. These methods face limitations in accurately and reliably assessing teachers' motivational behaviour, thus restricting the pace and scale of conducting research. One potential method to overcome these restrictions is automated coding methods. These methods are capable of analysing behaviour at a large scale with less time and at low costs. In this thesis, I conducted three studies to examine the applications of an automated coding method to assess teacher motivational behaviours. First, I systematically reviewed the applications of automated coding methods used to analyse helping professionals' interpersonal interactions using their verbal behaviour. The findings showed that automated coding methods were used in psychotherapy to predict the codes of a well-developed behavioural coding measure, in medical settings to predict conversation patterns or topics, and in education to predict simple concepts, such as the number of open/closed questions or class activity type (e.g., group work or teacher lecturing). In certain circumstances, these models achieved near human-level performance. However, few studies adhered to best-practice machine learning guidelines. Second, I developed a dictionary of teachers' motivational phrases and used it to automatically assess teachers' motivating and de-motivating behaviours. Results showed that the dictionary ratings of teacher need support achieved a strong correlation with observer ratings of need support (r = .73 for the full dictionary). Third, I developed a classification of teachers' motivational behaviour that would enable more advanced automated coding of teacher behaviours at each utterance level.
In this study, I created a classification that includes 57 teacher motivating and de-motivating behaviours that are consistent with self-determination theory. Automatically assessing teachers' motivational behaviour with automatic coding methods can provide accurate, fast-paced, and large-scale analysis of teacher motivational behaviour. This could allow for immediate feedback and also development of theoretical frameworks. The findings in this thesis can lead to the improvement of student motivation and other consequent student outcomes.
Eye on Collaborative Creativity : Insights From Multiple-Person Mobile Gaze Tracking in the Context of Collaborative Design
Early Career Workshop. Non peer reviewed
The NCTE Transcripts: A Dataset of Elementary Math Classroom Transcripts
Classroom discourse is a core medium of instruction -- analyzing it can
provide a window into teaching and learning as well as driving the development
of new tools for improving instruction. We introduce the largest dataset of
mathematics classroom transcripts available to researchers, and demonstrate how
this data can help improve instruction. The dataset consists of 1,660 45-60
minute long 4th and 5th grade elementary mathematics observations collected by
the National Center for Teacher Effectiveness (NCTE) between 2010 and 2013. The
anonymized transcripts represent data from 317 teachers across 4 school
districts that serve largely historically marginalized students. The
transcripts come with rich metadata, including turn-level annotations for
dialogic discourse moves, classroom observation scores, demographic
information, survey responses and student test scores. We demonstrate that our
natural language processing model, trained on our turn-level annotations, can
learn to identify dialogic discourse moves and these moves are correlated with
better classroom observation scores and learning outcomes. This dataset opens
up several possibilities for researchers, educators and policymakers to learn
about and improve K-12 instruction. The data and its terms of use can be
accessed here: https://github.com/ddemszky/classroom-transcript-analysi