10,037 research outputs found

    Training Noise-Robust Spoken Phrase Detectors with Scarce and Private Data: An Application to Classroom Observation Videos

    We explore how to automatically detect specific phrases in audio from noisy, multi-speaker videos using deep neural networks. Specifically, we focus on classroom observation videos that contain a few adult teachers and several small children (< 5 years old). At any point in these videos, multiple people may be talking, shouting, crying, or singing simultaneously. Our goal is to recognize polite speech phrases such as "Good job", "Thank you", "Please", and "You're welcome", as the occurrence of such speech is one of the behavioral markers used in classroom observation coding via the Classroom Assessment Scoring System (CLASS) protocol. Commercial speech recognition services such as Google Cloud Speech are impractical because of data privacy concerns. Therefore, we train and test our own custom models using a combination of publicly available classroom videos from YouTube, as well as a private dataset of real classroom observation videos collected by our colleagues at the University of Virginia. We also crowdsource an additional 1152 recordings of polite speech phrases to augment our training dataset. Our contributions are the following: (1) we design a crowdsourcing task for efficiently labeling speech events in classroom videos, (2) we develop a neural network-based architecture for speech recognition, robust to noise and overlapping speech, and (3) we explore methods to synthesize new and authentic audio data, both to increase the training set size and reduce the class imbalance. Finally, using our trained polite speech detector, (4) we investigate the relationship between polite speech and CLASS scores and enable teachers to visualize their use of polite language.

    Group-Level Emotion Recognition Using a Unimodal Privacy-Safe Non-Individual Approach

    This article presents our unimodal, privacy-safe, and non-individual proposal for the audio-video group emotion recognition subtask at the Emotion Recognition in the Wild (EmotiW) Challenge 2020. This sub-challenge aims to classify in-the-wild videos into three categories: Positive, Neutral and Negative. Recent deep learning models have shown tremendous advances in analyzing interactions between people, predicting human behavior and affective evaluation. Nonetheless, their performance comes from individual-based analysis, that is, summing up and averaging scores from individual detections, which inevitably raises privacy issues. In this research, we investigated a frugal approach towards a model able to capture the global mood from the whole image without using face or pose detection, or any individual-based feature as input. The proposed methodology mixes state-of-the-art and dedicated synthetic corpora as training sources. With an in-depth exploration of neural network architectures for group-level emotion recognition, we built a VGG-based model achieving 59.13% accuracy on the VGAF test set (eleventh place in the challenge). Given that the analysis is unimodal, based only on global features, and that the performance is evaluated on a real-world dataset, these results are promising and let us envision extending this model to multimodality for classroom ambiance evaluation, our final target application.

    Use of automated coding methods to assess motivational behaviour in education

    Teachers’ motivational behaviour is related to important student outcomes. Assessing teachers’ motivational behaviour has been helpful to improve teaching quality and enhance student outcomes. However, researchers in educational psychology have relied on self-report or observer ratings. These methods are limited in how accurately and reliably they assess teachers’ motivational behaviour, which restricts the pace and scale of research. One potential way to overcome these restrictions is automated coding methods, which can analyse behaviour at a large scale in less time and at low cost. In this thesis, I conducted three studies to examine the applications of an automated coding method to assess teacher motivational behaviours. First, I systematically reviewed the applications of automated coding methods used to analyse helping professionals’ interpersonal interactions using their verbal behaviour. The findings showed that automated coding methods were used in psychotherapy to predict the codes of a well-developed behavioural coding measure, in medical settings to predict conversation patterns or topics, and in education to predict simple concepts, such as the number of open/closed questions or class activity type (e.g., group work or teacher lecturing). In certain circumstances, these models achieved near human-level performance. However, few studies adhered to best-practice machine learning guidelines. Second, I developed a dictionary of teachers’ motivational phrases and used it to automatically assess teachers’ motivating and de-motivating behaviours. Results showed that the dictionary ratings of teacher need support achieved a strong correlation with observer ratings of need support (r = .73 for the full dictionary). Third, I developed a classification of teachers’ motivational behaviour that would enable more advanced automated coding of teacher behaviours at the level of each utterance. In this study, I created a classification that includes 57 teacher motivating and de-motivating behaviours that are consistent with self-determination theory. Automatically assessing teachers’ motivational behaviour with automatic coding methods can provide accurate, fast-paced, and large-scale analysis of teacher motivational behaviour. This could allow for immediate feedback and also the development of theoretical frameworks. The findings in this thesis can lead to the improvement of student motivation and other consequent student outcomes.
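    The dictionary-based coding described in the second study can be illustrated with a minimal sketch. It is not the thesis's implementation: the phrase lists, the per-100-words normalisation, and the function names `dictionary_score` and `pearson_r` are all assumptions made for illustration. The idea shown is only the general one of counting dictionary phrases in a transcript and correlating the resulting score with observer ratings.

```python
import re
from statistics import mean

# Hypothetical mini-dictionaries; the thesis's actual phrase lists are not
# reproduced in the abstract, so these entries are placeholders.
SUPPORTIVE = ["well done", "you can choose", "good thinking"]
THWARTING = ["you must", "hurry up"]


def count_phrases(text, phrases):
    """Count whole-phrase, case-insensitive occurrences of each phrase."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(p) + r"\b", lowered)) for p in phrases
    )


def dictionary_score(transcript):
    """Supportive minus thwarting phrase count, per 100 words (assumed scale)."""
    n_words = len(transcript.split())
    if n_words == 0:
        return 0.0
    net = count_phrases(transcript, SUPPORTIVE) - count_phrases(transcript, THWARTING)
    return 100.0 * net / n_words


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

    Under these assumptions, one would compute `dictionary_score` for each lesson transcript and pass the scores together with the matching observer ratings to `pearson_r`; a value near the reported .73 would require the real dictionary and data, neither of which is available here.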

    Eye on Collaborative Creativity : Insights From Multiple-Person Mobile Gaze Tracking in the Context of Collaborative Design

    Early Career Workshop. Non peer reviewed

    The NCTE Transcripts: A Dataset of Elementary Math Classroom Transcripts

    Classroom discourse is a core medium of instruction: analyzing it can provide a window into teaching and learning and can drive the development of new tools for improving instruction. We introduce the largest dataset of mathematics classroom transcripts available to researchers, and demonstrate how this data can help improve instruction. The dataset consists of 1,660 45-60 minute long 4th and 5th grade elementary mathematics observations collected by the National Center for Teacher Effectiveness (NCTE) between 2010 and 2013. The anonymized transcripts represent data from 317 teachers across 4 school districts that serve largely historically marginalized students. The transcripts come with rich metadata, including turn-level annotations for dialogic discourse moves, classroom observation scores, demographic information, survey responses and student test scores. We demonstrate that our natural language processing model, trained on our turn-level annotations, can learn to identify dialogic discourse moves and that these moves are correlated with better classroom observation scores and learning outcomes. This dataset opens up several possibilities for researchers, educators and policymakers to learn about and improve K-12 instruction. The data and its terms of use can be accessed here: https://github.com/ddemszky/classroom-transcript-analysi