13 research outputs found

    Automated Speech Act Classification in Tutorial Dialogue

    Speech act classification is the task of detecting speakers' intentions in discourse. Speech acts are grounded in the language-as-action theory, according to which saying something is also doing something. Speech act classification has various applications in natural language processing and dialogue-based intelligent systems. In this thesis, we propose machine learning models for speech act classification that account for both the content of the current utterance and the context (previous utterances) of the dialogue, and we present this work on two domains: human-human tutoring sessions and multi-party chat-based intelligent tutoring systems. The proposed models were trained and tested on chat utterances extracted from the tutoring sessions and, given the domain-specific properties of the datasets, were designed to work with hierarchical and granular speech act taxonomies.
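The content-plus-context idea above can be sketched with a minimal feature extractor; the dialogue representation and speech act labels here are illustrative assumptions, not the thesis's actual taxonomy:

```python
# Hedged sketch: features for classifying the speech act of utterance i
# combine the utterance's own tokens with the speech acts of the
# preceding utterances (the dialogue "context"). Labels are illustrative.

def extract_features(dialogue, i, context_size=2):
    """Return a sparse feature dict for utterance i."""
    feats = {f"cur:{tok}": 1 for tok in dialogue[i]["text"].lower().split()}
    # Context features: speech acts of up to `context_size` prior utterances.
    for k in range(1, context_size + 1):
        if i - k >= 0:
            feats[f"prev{k}_act:{dialogue[i - k]['act']}"] = 1
    return feats

dialogue = [
    {"text": "What is Newton's second law?", "act": "Question"},
    {"text": "Force equals mass times acceleration", "act": "Answer"},
    {"text": "Great job", "act": "PositiveFeedback"},
]
feats = extract_features(dialogue, 2)
```

Feature dicts of this shape would then feed any standard supervised classifier; the context features let a model learn, for example, that feedback tends to follow answers.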

    Automated Session-Quality Assessment for Human Tutoring Based on Expert Ratings of Tutoring Success

    Archived transcripts from tens of millions of online human tutoring sessions potentially contain important knowledge about how online tutors help, or fail to help, students learn. However, without ways of automatically analyzing these large corpora, any knowledge in this data will remain buried. One way to approach this issue is to train an estimator for the learning effectiveness of an online tutoring interaction. While significant work has been done on automated assessment of student responses and artifacts (e.g., essays), automated assessment has not traditionally been applied to human-to-human tutoring sessions. In this work, we trained a model for estimating tutoring session quality based on a corpus of 1438 online tutoring sessions rated by expert tutors. Each session was rated for evidence of learning (outcomes) and educational soundness (process). Session features for this model included dialog act classifications, mode classifications (e.g., Scaffolding), statistically distinctive subsequences of such classifications, dialog initiative (e.g., statements by tutor vs. student), and session length. The model correlated more highly with evidence-of-learning ratings than with educational-soundness ratings, in part due to the greater difficulty of classifying tutoring modes. This model was then applied to a corpus of 242k online tutoring sessions to examine the relationships between automated assessments and other available metadata (e.g., the tutor's self-assessment). On this large corpus, the automated assessments followed patterns similar to the expert raters' assessments, but with lower overall correlation strength. Based on the analyses presented, the assessment model emulates expert human tutors' session quality ratings with a reasonable degree of accuracy.
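The session-level feature families named above (dialog act proportions, act subsequences, initiative, length) can be sketched roughly as follows; the act labels and normalizations are assumptions for illustration, not the paper's exact specification:

```python
from collections import Counter

def session_features(acts, speakers):
    """Compute session-level features from per-utterance dialog act
    labels and speaker tags: act proportions, act-bigram proportions
    (a simple stand-in for distinctive subsequences), tutor initiative,
    and session length."""
    n = len(acts)
    feats = {f"act:{a}": c / n for a, c in Counter(acts).items()}
    for (a, b), c in Counter(zip(acts, acts[1:])).items():
        feats[f"bigram:{a}>{b}"] = c / (n - 1)
    feats["tutor_initiative"] = speakers.count("tutor") / n
    feats["length"] = n
    return feats
```

A regression model trained on such vectors against the expert ratings would play the role of the session-quality estimator.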

    A tool for speech act classification using interactive machine learning

    In this demo, we introduce a tool that provides a GUI for a previously designed speech act classifier. The tool also provides features for humans to manually annotate data and to evaluate and improve the automated classifier. We describe the interface and evaluate our model using results from two human judges and the computer.

    Predicting performance behaviors during question generation in a game-like intelligent tutoring system

    The present research investigates learning constructs that predict performance behaviors during question generation in a serious game known as Operation ARA. In a between-subjects design, undergraduate students (N=66) completed the game's three teaching modules, which cover basic factual information, application of knowledge, and finally question generation about scientific research cases. Results suggest that constructs such as time-on-task, discrimination, and generation, along with type of instruction (factual vs. applied), impact student behaviors during question generation.

    Assessing the dialogic properties of classroom discourse: Proportion models for imbalanced classes

    Automatic assessment of the dialogic properties of classroom discourse would benefit several widespread classroom observation protocols. However, in classrooms with low incidences of dialogic discourse, assessment can be highly biased against detecting dialogic properties. In this paper, we present an approach to addressing this imbalanced-class problem. Rather than perform classifications at the utterance level, we aggregate feature vectors to classify proportions of dialogic properties at the class-session level, achieving a moderate correlation with actual proportions, r(130) = .50, p < .001, CI95 [.36, .61]. We show that this approach outperforms aggregating utterance-level classifications, r(130) = .27, p = .001, CI95 [.11, .43], is stable for both low- and high-dialogic classrooms, and is stable across both automatic speech recognition and human transcripts.
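The aggregation step described above, pooling utterance-level feature vectors into one session-level vector before predicting a proportion, can be sketched as simple mean pooling; the pooling choice is an assumption for illustration:

```python
def aggregate_session(utterance_vectors):
    """Mean-pool per-utterance feature vectors into a single
    session-level vector, which a regressor can then map to the
    proportion of dialogic utterances in the session."""
    n = len(utterance_vectors)
    dims = len(utterance_vectors[0])
    return [sum(vec[d] for vec in utterance_vectors) / n for d in range(dims)]
```

Because the target is a session-level proportion rather than per-utterance labels, rare dialogic utterances contribute to the session vector without the model having to win many individual, heavily imbalanced classifications.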

    Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression

    We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine learning techniques are employed to explore the relationship between event hypocentres and seismic features of the recorded signals in the time, frequency, and time-frequency domains. We applied the technique to 440 microearthquakes, with magnitudes between -1.7 and 1.29, induced by an underground cavern collapse in the Napoleonville Salt Dome in Bayou Corne, Louisiana. Forty different seismic attributes of whole seismograms, including degree of polarization and spectral attributes, were measured. A selected set of features was then used to train the system to discriminate between deep and shallow events based on the knowledge gained from existing patterns. The cross-validation test showed that events with depths shallower than 250 m can be discriminated from events with hypocentral depths between 1000 and 2000 m with 88 per cent and 90.7 per cent accuracy using logistic regression and artificial neural network models, respectively. Similar results were obtained using single-station seismograms. The results show that the spectral features have the highest correlation with source depth. Spectral centroids and 2-D cross-correlations in the time-frequency domain are two new seismic features used in this study that proved to be promising measures for seismic event classification. The machine-learning techniques used here have applications for efficient automatic classification of low-energy signals recorded at one or more seismic stations.
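One of the spectral features highlighted above, the spectral centroid, is straightforward to compute. This stdlib-only sketch uses a plain DFT for clarity (a real implementation would use an FFT):

```python
import math

def spectral_centroid(signal, sample_rate):
    """Amplitude-weighted mean frequency of a signal, computed from the
    magnitude spectrum of a direct DFT (positive frequencies only).
    Deeper events, depleted in high frequencies along the travel path,
    tend to have lower centroids."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(-signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        freqs.append(k * sample_rate / n)
        mags.append(math.hypot(re, im))
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0
```

Feeding per-event features of this kind to a logistic regression (deep vs. shallow) mirrors the discrimination setup the study evaluates.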

    Evaluation dataset (DT-Grade) and word weighting approach towards constructed short answers assessment in tutorial dialogue context

    Evaluating student answers often requires contextual information, such as previous utterances in conversational tutoring systems. For example, students use coreferences and write elliptical responses, i.e., responses that are incomplete but can be interpreted in context. The DT-Grade corpus, which we present in this paper, consists of short constructed answers extracted from tutorial dialogues between students and an intelligent tutoring system, annotated for their correctness in the given context and for whether the contextual information was useful. The dataset contains 900 answers, of which about 25% required contextual information to be properly interpreted. We also present a baseline system developed to predict the correctness label (such as correct, or correct but incomplete), in which weights for the words are assigned based on context.
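A context-sensitive word weighting scheme of the general kind described can be sketched as follows; the specific weights and scoring rule are illustrative assumptions, not the paper's actual formula:

```python
def score_answer(answer_tokens, context_tokens, reference_tokens):
    """Score a student answer against a reference answer, down-weighting
    words the student could have simply echoed from the dialogue context
    so that novel content words dominate the match."""
    context = set(context_tokens)
    weights = {t: (0.5 if t in context else 1.0) for t in set(answer_tokens)}
    matched = sum(w for t, w in weights.items() if t in set(reference_tokens))
    total = sum(weights.values())
    return matched / total if total else 0.0
```

Thresholding such a score would yield labels of the kind the corpus uses, e.g. correct, correct but incomplete, or incorrect.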

    Technologies for automated analysis of co-located, real-life, physical learning spaces : where are we now?

    The motivation for this paper derives from the increasing interest among researchers and practitioners in developing technologies that capture, model, and analyze learning and teaching experiences that take place beyond computer-based learning environments. In this paper, we review case studies of tools and technologies developed to collect and analyze data in educational settings, quantify learning and teaching processes, and support assessment of learning and teaching in an automated fashion. We focus on pipelines that leverage information and data harnessed from physical spaces and/or integrate collected data across physical and digital spaces. Our review reveals a promising field of physical classroom analysis. We describe some trends and suggest potential future directions. Specifically, more research should be geared towards a) deployable and sustainable data collection set-ups in physical learning environments, b) teacher assessment, c) developing feedback and visualization systems, and d) promoting inclusivity and generalizability of models across populations. This project is supported by a grant from the Centre for Research and Development in Learning (CRADLE@NTU), Nanyang Technological University.

    Multimodal capture of teacher-student interactions for automated dialogic analysis in live classrooms

    We focus on data collection designs for the automated analysis of teacher-student interactions in live classrooms, with the goal of identifying instructional activities (e.g., lecturing, discussion) and assessing the quality of dialogic instruction (e.g., analysis of questions). Our designs were motivated by multiple technical requirements and constraints. Most importantly, teachers could be individually mic'ed, but their audio needed to be of excellent quality for automatic speech recognition (ASR) and spoken utterance segmentation. Individual students could not be mic'ed, but classroom audio quality only needed to be sufficient to detect student spoken utterances. Visual information could only be recorded if students could not be identified. Design 1 used an omnidirectional laptop microphone to record both teacher and classroom audio and was quickly deemed unsuitable. In Designs 2 and 3, teachers wore a wireless Samson AirLine 77 vocal headset system, a unidirectional microphone with a cardioid pickup pattern. In Design 2, classroom audio was recorded with dual first-generation Microsoft Kinects placed at the front corners of the class. Design 3 used a Crown PZM-30D pressure zone microphone mounted on the blackboard to record classroom audio. Designs 2 and 3 were tested by recording audio in 38 live middle school classrooms from six U.S. schools while trained human coders simultaneously performed live coding of classroom discourse. Qualitative and quantitative analyses revealed that Design 3 was suitable for three of our core tasks: (1) ASR on teacher speech (word recognition rate of 66% and word overlap rate of 69% using the Google Speech ASR engine); (2) teacher utterance segmentation (F-measure of 97%); and (3) student utterance segmentation (F-measure of 66%). Ideas for incorporating video and skeletal tracking with dual second-generation Kinects to produce Design 4 are discussed.

    Identifying Teacher Questions Using Automatic Speech Recognition in Classrooms

    We investigate automatic question detection from recordings of teacher speech collected in live classrooms. Our corpus contains audio recordings of 37 class sessions taught by 11 teachers. We automatically segment teacher speech into utterances using an amplitude envelope thresholding approach followed by filtering non-speech via automatic speech recognition (ASR). We manually code the segmented utterances as containing a teacher question or not based on an empirically-validated scheme for coding classroom discourse. We compute domain-independent natural language processing (NLP) features from transcripts generated by three ASR engines (AT&T, Bing Speech, and Azure Speech). Our teacher-independent supervised machine learning model detects questions with an overall weighted F1 score of 0.59, a 51% improvement over chance. Furthermore, the proportion of automatically-detected questions per class session strongly correlates (Pearson's r = 0.85) with human-coded question rates. We consider our results to reflect a substantial (37%) improvement over the state-of-the-art in automatic question detection from naturalistic audio. We conclude by discussing applications of our work for teachers, researchers, and other stakeholders.
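The amplitude-envelope thresholding step described above can be sketched as follows; the frame length, threshold, and gap-merging rule are illustrative assumptions:

```python
def segment_utterances(samples, frame_len, threshold, min_gap=2):
    """Segment audio into utterances: frames whose mean absolute
    amplitude exceeds `threshold` count as speech, and speech runs
    separated by fewer than `min_gap` silent frames are merged.
    Returns (start_frame, end_frame_exclusive) pairs."""
    n_frames = len(samples) // frame_len
    env = [sum(abs(s) for s in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
           for i in range(n_frames)]
    segments, start, gap = [], None, 0
    for i, e in enumerate(env):
        if e >= threshold:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                segments.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        segments.append((start, n_frames - gap))
    return segments
```

In the pipeline described, each resulting segment would then be passed through ASR, with segments yielding no recognized words filtered out as non-speech.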