Unsupervised Model Selection for Time-series Anomaly Detection
Anomaly detection in time-series has a wide range of practical applications.
While numerous anomaly detection methods have been proposed in the literature,
a recent survey concluded that no single method is the most accurate across
various datasets. To make matters worse, anomaly labels are scarce and rarely
available in practice. The practical problem of selecting the most accurate
model for a given dataset without labels has received little attention in the
literature. This paper answers this question: given an unlabeled dataset
and a set of candidate anomaly detectors, how can we select the most accurate
model? To this end, we identify three classes of surrogate (unsupervised)
metrics, namely, prediction error, model centrality, and performance on
injected synthetic anomalies, and show that some metrics are highly correlated
with standard supervised anomaly detection performance metrics such as the
F1 score, but to varying degrees. We formulate metric combination with
multiple imperfect surrogate metrics as a robust rank aggregation problem. We
then provide theoretical justification behind the proposed approach.
Large-scale experiments on multiple real-world datasets demonstrate that our
proposed unsupervised approach is as effective as selecting the most accurate
model based on partially labeled data.
Comment: Accepted at International Conference on Learning Representations
(ICLR) 2023 with a notable-top-25% recommendation. Reviewer, AC and author
discussion available at https://openreview.net/forum?id=gOZ_pKANaP
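The idea of combining several imperfect surrogate metrics by rank aggregation can be illustrated with a minimal sketch. It uses a plain Borda count as a simple stand-in; the abstract does not specify the paper's robust aggregation procedure, so this is not the authors' method. The detector names, the metric values, and the function `borda_aggregate` are hypothetical, and every metric is assumed to be oriented so that higher is better (for example, a negated prediction error).

```python
from typing import Dict, List


def borda_aggregate(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Combine per-metric scores into one ranking with a Borda count.

    scores[metric][model] is a surrogate score where higher is better.
    Returns model names ordered from best to worst.
    """
    # Sort names first so that ties are broken alphabetically (sorted() is stable).
    models = sorted(next(iter(scores.values())).keys())
    points = {model: 0 for model in models}
    for metric_scores in scores.values():
        # Rank the models under this metric, best first.
        ranked = sorted(models, key=lambda m: metric_scores[m], reverse=True)
        for position, model in enumerate(ranked):
            # The best model earns n-1 points, the worst earns 0.
            points[model] += len(models) - 1 - position
    return sorted(models, key=lambda m: points[m], reverse=True)


# Hypothetical surrogate scores for three candidate detectors (illustrative values only).
surrogate_scores = {
    "negated_prediction_error": {"detector_a": 0.9, "detector_b": 0.5, "detector_c": 0.1},
    "model_centrality": {"detector_a": 0.6, "detector_b": 0.8, "detector_c": 0.2},
    "synthetic_anomaly_f1": {"detector_a": 0.7, "detector_b": 0.4, "detector_c": 0.5},
}

ranking = borda_aggregate(surrogate_scores)
print(ranking)  # the first entry is the detector the aggregate ranking would select
```

A Borda count weights every metric equally, so a single unreliable metric can still shift the result; a robust aggregation scheme, as the abstract describes, would aim to limit that influence.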
Discriminating Cognitive Disequilibrium and Flow in Problem Solving: A Semi-Supervised Approach Using Involuntary Dynamic Behavioral Signals
Problem solving is one of the most important 21st century skills. However, effectively coaching young students in problem solving is challenging because teachers must continuously monitor students' cognitive and affective states and make real-time pedagogical interventions to maximize their learning outcomes. It is an even more challenging task in social environments with limited human coaching resources. To lessen the cognitive load on a teacher and enable affect-sensitive intelligent tutoring, many researchers have investigated automated cognitive and affective detection methods. However, most of the studies use culturally sensitive indices of affect that are prone to social editing, such as facial expressions, and only a few studies have explored involuntary dynamic behavioral signals such as gross body movements. In addition, most current methods rely on expensive labelled data from trained annotators for supervised learning. In this paper, we explore a semi-supervised learning framework that can learn low-dimensional representations of involuntary dynamic behavioral signals (mainly gross body movements) from a modest number of short time series segments. Experiments on a real-world dataset reveal a significant advantage of these representations in discriminating cognitive disequilibrium and flow, as compared to traditional complexity measures from the dynamical systems literature, and demonstrate their potential in transferring learned models to previously unseen subjects.
What’s Most Broken? A Tool to Assist Data-Driven Iterative Improvement of an Intelligent Tutoring System
Intelligent Tutoring Systems (ITSs) have great potential to change the educational landscape by bringing scientifically tested one-to-one tutoring to remote and under-served areas. However, effective ITSs are too complex to perfect. Instead, a practical guiding principle for ITS development and improvement is to fix what’s most broken. In this paper we present SPOT (Statistical Probe of Tutoring): a tool that mines data logged by an Intelligent Tutoring System to identify the ‘hot spots’ most detrimental to its efficiency and effectiveness in terms of its software reliability, usability, task difficulty, student engagement, and other criteria. SPOT uses heuristics and machine learning to discover, characterize, and prioritize such hot spots in order to focus ITS refinement on what matters most. We applied SPOT to data logged by RoboTutor, an ITS that teaches children basic reading, writing and arithmetic.
A Multi-Task Approach to Open Domain Suggestion Mining (Student Abstract)
Online consumer reviews may contain suggestions useful for improving the target products and services. Mining suggestions is challenging because the field lacks large, labelled and balanced datasets. Furthermore, most prior studies have focused only on mining suggestions in a single domain. In this work, we introduce a novel up-sampling technique to address the problem of class imbalance, and propose a multi-task deep learning approach for mining suggestions from multiple domains. Experimental results on a publicly available dataset show that our up-sampling technique coupled with the multi-task framework outperforms state-of-the-art open domain suggestion mining models in terms of the F-1 measure and AUC.
Modeling Involuntary Dynamic Behaviors to Support Intelligent Tutoring (Student Abstract)
Problem solving is one of the most important 21st century skills. However, effectively coaching young students in problem solving is challenging because teachers must continuously monitor students' cognitive and affective states and make real-time pedagogical interventions to maximize their learning outcomes. It is an even more challenging task in social environments with limited human coaching resources. To lessen the cognitive load on a teacher and enable affect-sensitive intelligent tutoring, many researchers have investigated automated cognitive and affective detection methods. However, most of the studies use culturally sensitive indices of affect that are prone to social editing, such as facial expressions, and only a few studies have explored involuntary dynamic behavioral signals such as gross body movements. In addition, most current methods rely on expensive labelled data from trained annotators for supervised learning. In this paper, we explore a semi-supervised learning framework that can learn low-dimensional representations of involuntary dynamic behavioral signals (mainly gross body movements) from a modest number of short time series segments. Experiments on a real-world dataset reveal the significant utility of these representations in discriminating cognitive disequilibrium and flow and demonstrate their potential in transferring learned models to previously unseen subjects.