Multimodal Group Activity Dataset for Classroom Engagement Level Prediction
We collected a new dataset that includes approximately eight hours of
audiovisual recordings of a group of students and their self-evaluation scores
for classroom engagement. The dataset and data analysis scripts are available
on our open-source repository. We developed baseline face-based and
group-activity-based image and video recognition models. Our image models reach
45-85% test accuracy on the person-based classification task with face-area
inputs, and our video models achieve up to 71% test accuracy on group-level
prediction from group-activity video inputs. In this technical report, we share
the details of our end-to-end, human-centered engagement-analysis pipeline,
from data collection to model development.
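As an illustration of the group-level prediction step, per-person engagement labels can be fused into a single group label. The majority-vote helper below is a minimal sketch under our own assumptions; the report's actual fusion strategy is not specified in this summary, and the label names are hypothetical:

```python
from collections import Counter

def group_engagement(person_labels):
    """Fuse per-person engagement labels (e.g., 'low'/'mid'/'high')
    into one group-level label by majority vote; ties are broken by
    the alphabetically first label so the result is deterministic."""
    counts = Counter(person_labels)
    best = max(counts.values())
    return sorted(label for label, n in counts.items() if n == best)[0]
```

For example, `group_engagement(['high', 'high', 'low'])` returns `'high'`.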
Towards Building Child-Centered Machine Learning Pipelines: Use Cases from K-12 and Higher-Education
Researchers and policy-makers have started creating frameworks and guidelines
for building machine-learning (ML) pipelines with a human-centered lens.
Machine-learning pipelines comprise all the steps needed to develop an ML
system (e.g., a predictive keyboard). Meanwhile, a child-centered focus in
developing ML systems has recently been gaining interest, as children are
becoming users of these products. These efforts predominantly concern
children's interaction with ML-based systems; in our experience, however, the
ML pipelines themselves are yet to be adapted through a child-centered lens.
In this paper, we list the questions we ask ourselves when adapting
human-centered ML pipelines into child-centered ones, and we summarize two
case studies of building end-to-end ML pipelines for children's products.
Sketch-based interaction and modeling: where do we stand?
Sketching is a natural and intuitive communication tool used for expressing concepts or ideas which are difficult to communicate through text or speech alone. Sketching is therefore used for a variety of purposes, from the expression of ideas on two-dimensional (2D) physical media, to object creation, manipulation, or deformation in three-dimensional (3D) immersive environments. This variety in sketching activities brings about a range of technologies which, while having similar scope, namely that of recording and interpreting the sketch gesture to effect some interaction, adopt different interpretation approaches according to the environment in which the sketch is drawn. In fields such as product design, sketches are drawn at various stages of the design process, and therefore designers would benefit from sketch interpretation technologies which support these differing interactions. However, research typically focuses on one aspect of sketch interpretation and modeling, such that literature on available technologies is fragmented and dispersed. In this paper, we bring together the relevant literature describing technologies which can support the product design industry, namely technologies which support the interpretation of sketches drawn on 2D media, sketch-based search interactions, as well as sketch gestures drawn in 3D media. This paper therefore gives a holistic view of the algorithmic support that can be provided in the design process. In so doing, we highlight the research gaps and future research directions required to provide full sketch-based interaction support.
Overview of Recent Work in Pen-Centric Computing
A major portion of pen-centric research has revolved around the goal of enabling natural human-computer interaction. We believe progress in two areas is critical to achieving the goal of natural sketch-based interfaces. First, we need to improve on existing recognition algorithms in terms of efficiency and recognition accuracy. Our work on recognizing sketches using the temporal patterns that naturally appear in online sketching contributes toward addressing these algorithmic issues. Second, we need to construct and evaluate pen-based applications that can readily be adopted by the target user groups and immediately integrated into their workflows. The sketch interpretation component of these applications should be robust enough to allow their deployment in real usage settings. Toward this end, we have focused on the construction and evaluation of pen-based interfaces for two simple domains: shortest-path graphs and probabilistic network diagrams. Below, we describe our recent work on recognition algorithms and readily adoptable pen-based tools, and we point out demonstrations that we would like to share with the workshop participants.
HAISTA-NET: Human Assisted Instance Segmentation Through Attention
Instance segmentation is a form of image detection which has a range of
applications, such as object refinement, medical image analysis, and
image/video editing, all of which demand a high degree of accuracy. However,
this precision is often beyond the reach of what even state-of-the-art, fully
automated instance segmentation algorithms can deliver. The performance gap
becomes particularly prohibitive for small and complex objects. Practitioners
typically resort to fully manual annotation, which can be a laborious process.
In order to overcome this problem, we propose a novel approach to enable more
precise predictions and generate higher-quality segmentation masks for
high-curvature, complex and small-scale objects. Our human-assisted
segmentation model, HAISTA-NET, augments the existing Strong Mask R-CNN network
to incorporate human-specified partial boundaries. We also present a dataset of
hand-drawn partial object boundaries, which we refer to as human attention
maps: the Partial Sketch Object Boundaries (PSOB) dataset, whose sketches
capture the high-curvature regions of an object's ground-truth mask with only
a few pixels. Through extensive evaluation on PSOB, we show that HAISTA-NET
outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN,
and Mask2Former, achieving respective gains of +36.7, +29.6, and +26.5
AP-Mask points over these three models. We hope that our approach will set a
baseline for future human-aided deep learning models by combining fully
automated and interactive instance segmentation architectures.
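For readers curious how a hand-drawn boundary might enter such a network, one simple option is to rasterize the human attention map and stack it onto the RGB input as a fourth channel. The NumPy sketch below illustrates only this channel-level fusion; where HAISTA-NET actually injects the map into Strong Mask R-CNN is not stated in the abstract, so treat the fusion point as an assumption:

```python
import numpy as np

def fuse_attention_channel(image, attention_map):
    """Stack a rasterized partial-boundary sketch onto an RGB image as a
    fourth input channel, yielding an (H, W, 4) float array in [0, 1].

    image:         (H, W, 3) uint8 RGB image
    attention_map: (H, W) binary mask of sketched boundary pixels
    """
    if image.shape[:2] != attention_map.shape:
        raise ValueError("image and attention map must share spatial size")
    rgb = image.astype(np.float32) / 255.0          # normalize colors
    attn = attention_map.astype(np.float32)[..., None]  # add channel axis
    return np.concatenate([rgb, attn], axis=-1)
```

A downstream backbone would then simply be configured for 4 input channels instead of 3.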
Haptic Stylus with Inertial and Vibro-Tactile Feedback
In this paper, we introduce a novel stylus capable of displaying two haptic effects to the user. The first is a tactile flow effect up and down along the pen; the other is a rotation effect about the pen's long axis. The flow effect is based on the haptic illusion of “apparent tactile motion”, while the rotation effect comes from the reaction torque created by an electric motor placed along the stylus shaft. The stylus is embedded with two vibration actuators at its ends and a DC motor with a rotating balanced mass in the middle. We show that it is possible to create flow and rotation effects on the stylus by driving these actuators. Furthermore, we show that the timing and actuation patterns of the vibration actuators and the DC motor significantly affect the discernibility of the synthesized perceptions; hence, these parameters should be selected carefully. Two psychophysical experiments, each performed with 10 subjects, shed light on the discernibility of the two haptic effects as a function of various actuation parameters. Our results show that, with carefully selected parameters, subjects can successfully identify the flow of motion and the direction of rotation with high accuracy.
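The “apparent tactile motion” illusion is typically synthesized by overlapping two vibration bursts whose onsets are staggered by a stimulus onset asynchrony (SOA) tied to burst duration. The sketch below uses the classic Sherrick-and-Rogers rule of thumb (SOA ≈ 0.32 × duration + 47.3 ms); the stylus's actual drive parameters are the subject of the paper's psychophysical experiments and are assumptions here:

```python
import numpy as np

def tactile_flow_signals(duration_ms=100.0, fs=8000, carrier_hz=250.0):
    """Generate drive signals for two vibration actuators that evoke
    apparent tactile motion: two Hann-windowed sine bursts whose onsets
    are staggered by SOA = 0.32 * duration + 47.3 ms.
    Returns (first, second): equal-length arrays sampled at fs Hz."""
    soa_ms = 0.32 * duration_ms + 47.3
    n_burst = int(fs * duration_ms / 1000.0)   # samples per burst
    n_soa = int(fs * soa_ms / 1000.0)          # onset stagger in samples
    total = n_soa + n_burst
    t = np.arange(n_burst) / fs
    burst = np.sin(2.0 * np.pi * carrier_hz * t) * np.hanning(n_burst)
    first, second = np.zeros(total), np.zeros(total)
    first[:n_burst] = burst                    # leading actuator
    second[n_soa:] = burst                     # trailing actuator
    return first, second
```

Swapping which actuator receives `first` reverses the perceived direction of flow along the pen.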
Intention Recognition for Dynamic Role Exchange in Haptic Collaboration
In human-computer collaboration involving haptics, a key open issue is establishing intuitive communication between the partners. Although computers are widely used to aid human operators in teleoperation, guidance, and training, they lack the adaptability, versatility, and awareness of a human, so their ability to improve efficiency and effectiveness in dynamic tasks is limited. We suggest that communication between a human and a computer can be improved if it involves a decision-making process in which the computer is programmed to infer the intentions of the human operator and dynamically adjust the control levels of the interacting parties, facilitating a more intuitive interaction. In this paper, we investigate the utility of such a dynamic role exchange mechanism, in which partners negotiate through the haptic channel to trade their control levels on a collaborative task. We examine the energy consumption, the work done on the manipulated object, and the joint efficiency, in addition to task performance. We show that, compared to an equal-control condition, the role exchange mechanism improves task performance and the joint efficiency of the partners. We also show that augmenting the system with additional informative visual and vibrotactile cues, which display the state of the interaction, allows users to become aware of the underlying role exchange mechanism and exploit it in favor of the task. These cues also improve the users' sense of interaction and reinforce their belief that the computer aids with the execution of the task.