SCB-dataset: A Dataset for Detecting Student Classroom Behavior
The use of deep learning methods for automatic detection of students'
classroom behavior is a promising approach to analyze their class performance
and enhance teaching effectiveness. However, the lack of publicly available
datasets on student behavior poses a challenge for researchers in this field.
To address this issue, we propose a Student Classroom Behavior dataset
(SCB-dataset) that reflects real-life scenarios. Our dataset includes 11,248
labels and 4,003 images, with a focus on hand-raising behavior. We evaluated
the dataset using the YOLOv7 algorithm, achieving a mean average precision
(mAP) of up to 85.3%. We believe that our dataset can serve as a robust
foundation for future research in the field of student behavior detection and
promote further advancements in this area. Our SCB-dataset can be downloaded
from: https://github.com/Whiffe/SCB-dataset
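As a rough illustration of how such an evaluation could be reproduced, the sketch below fine-tunes and validates a YOLO-family detector on the dataset using the ultralytics package; the paper itself reports results with YOLOv7, and the scb.yaml dataset description used here is a hypothetical placeholder.

```python
# Minimal sketch (not the authors' pipeline): evaluate a YOLO-format detector
# on the SCB-dataset. Assumes the images/labels have been exported to YOLO
# format and described in a hypothetical scb.yaml file.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                          # any YOLO-family checkpoint
model.train(data="scb.yaml", epochs=100, imgsz=640)
metrics = model.val(data="scb.yaml")                # COCO-style box metrics
print(f"mAP@0.5 = {metrics.box.map50:.3f}")         # the paper reports up to 85.3%
```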
Student Classroom Behavior Detection based on Improved YOLOv7
Accurately detecting student behavior in classroom videos can aid in
analyzing their classroom performance and improving teaching effectiveness.
However, the current accuracy rate in behavior detection is low. To address
this challenge, we propose the Student Classroom Behavior Detection method,
based on improved YOLOv7. First, we created the Student Classroom Behavior
dataset (SCB-Dataset), which includes 18.4k labels and 4.2k images, covering
three behaviors: hand raising, reading, and writing. To improve detection
accuracy in crowded scenes, we integrated the BiFormer attention module and
Wise-IoU into the YOLOv7 network. Finally, experiments were conducted on the
SCB-Dataset, and the model achieved an mAP@0.5 of 79%, resulting in a 1.8%
improvement over previous results. The SCB-Dataset and code are available for
download at: https://github.com/Whiffe/SCB-dataset.
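The abstract names Wise-IoU as the bounding-box regression loss but does not give its form. The PyTorch sketch below implements the v1 formulation from the Wise-IoU paper (an attention-style factor over a plain IoU loss, with the enclosing-box normaliser detached) as one plausible drop-in for the YOLOv7 box branch; the exact variant and hyperparameters used by the authors are not reproduced here.

```python
import torch

def wise_iou_v1(pred, target, eps=1e-7):
    """Sketch of a Wise-IoU (v1) style box regression loss.
    pred, target: (N, 4) boxes as (x1, y1, x2, y2) in pixels."""
    # plain IoU
    x1 = torch.max(pred[:, 0], target[:, 0]); y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2]); y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # centre-distance penalty, normalised by the smallest enclosing box (detached)
    cxp, cyp = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cxt, cyt = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    wg = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    hg = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    r = torch.exp(((cxp - cxt) ** 2 + (cyp - cyt) ** 2) / (wg ** 2 + hg ** 2 + eps).detach())
    return r * (1.0 - iou)                          # per-box loss
```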
Student Classroom Behavior Detection based on Spatio-Temporal Network and Multi-Model Fusion
Using deep learning methods to detect students' classroom behavior
automatically is a promising approach for analyzing their class performance and
improving teaching effectiveness. However, the lack of publicly available
spatio-temporal datasets on student behavior, as well as the high cost of
manually labeling such datasets, pose significant challenges for researchers in
this field. To address this issue, we propose a method for extending the
spatio-temporal behavior dataset in Student Classroom Scenarios
(SCB-ST-Dataset4) from an image dataset. Our SCB-ST-Dataset4 comprises 757,265
images with 25,810 labels, focusing on three behaviors: hand-raising, reading,
and writing. Our proposed method can rapidly generate spatio-temporal behavior
datasets without requiring extra manual labeling. Furthermore, we propose a
Behavior Similarity Index (BSI) to explore the similarity of behaviors. We
evaluated the dataset using the YOLOv5, YOLOv7, YOLOv8, and SlowFast
algorithms, achieving a mean average precision (mAP) of up to 82.3%. Finally, we
fused multiple models to generate student behavior-related data from various
perspectives. The experiments further demonstrate the effectiveness of our
method, and the SCB-ST-Dataset4 provides a robust foundation for future research in
student behavior detection, potentially contributing to advancements in this
field. The SCB-ST-Dataset4 is available for download at:
https://github.com/Whiffe/SCB-dataset.
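One plausible reading of "extending the dataset from an image dataset without extra manual labeling" is that the boxes on each labeled keyframe are propagated to the surrounding frames of a short clip, yielding spatio-temporal tubes for SlowFast-style training. The sketch below illustrates that idea only; the paper's actual procedure and its Behavior Similarity Index (BSI) are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Box:
    frame: int      # frame index within the video
    cls: str        # "hand-raising", "reading" or "writing"
    xyxy: tuple     # (x1, y1, x2, y2) in pixels

def propagate_keyframe_labels(keyframe_boxes, half_window=15):
    """Copy each keyframe box onto the neighbouring frames of a short clip,
    assuming the behavior changes slowly over ~1 s (an illustrative assumption,
    not the paper's exact pipeline)."""
    clip_labels = []
    for box in keyframe_boxes:
        for f in range(max(0, box.frame - half_window), box.frame + half_window + 1):
            clip_labels.append(Box(frame=f, cls=box.cls, xyxy=box.xyxy))
    return clip_labels

# Example: one labeled keyframe expands into a 31-frame tube of boxes.
tube = propagate_keyframe_labels([Box(120, "hand-raising", (40, 60, 120, 200))])
print(len(tube))  # 31
```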
An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display
We present a tele-immersive system that enables people to interact with each
other in a virtual world using body gestures in addition to verbal
communication. Beyond the obvious applications, including general online
conversations and gaming, we hypothesize that our proposed system would be
particularly beneficial to education by offering rich visual contents and
interactivity. One distinct feature is the integration of egocentric pose
recognition that allows participants to use their gestures to demonstrate and
manipulate virtual objects simultaneously. This functionality enables the
instructor to effectively and efficiently explain and illustrate complex
concepts or sophisticated problems in an intuitive manner. The highly
interactive and flexible environment can capture and sustain more student
attention than the traditional classroom setting and, thus, delivers a
compelling experience to the students. Our main focus here is to investigate
possible solutions for the system design and implementation and devise
strategies for fast, efficient computation suitable for visual data processing
and network transmission. We describe the technique and experiments in detail
and provide quantitative performance results, demonstrating that our system
runs comfortably and reliably in different application scenarios. Our
preliminary results are promising and demonstrate the potential for more
compelling directions in cyberlearning.
A Spatio-Temporal Attention-Based Method for Detecting Student Classroom Behaviors
Accurately detecting student behavior from classroom videos is beneficial for
analyzing their classroom status and improving teaching efficiency. However,
low accuracy in student classroom behavior detection is a prevalent issue. To
address this issue, we propose a Spatio-Temporal Attention-Based Method for
Detecting Student Classroom Behaviors (BDSTA). Firstly, the SlowFast network is
used to generate motion and environmental information feature maps from the
video. Then, the spatio-temporal attention module is applied to the feature
maps, including information aggregation, compression and stimulation processes.
Subsequently, attention maps in the time, channel and space dimensions are
obtained, and multi-label behavior classification is performed based on these
attention maps. To solve the long-tail data problem that exists in student
classroom behavior datasets, we use an improved focal loss function to assign
more weight to the tail class data during training. Experiments are conducted
on a self-built student classroom behavior dataset named STSCB. Compared with
the SlowFast model, the average accuracy of student behavior classification
improves by 8.94% using BDSTA.
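The improved focal loss itself is not specified in the abstract; as a point of reference, the sketch below shows a standard class-weighted focal loss for multi-label behavior classification, where rare (tail) behaviors receive larger weights. It is a generic formulation, not the specific variant proposed in BDSTA.

```python
import torch
import torch.nn.functional as F

def weighted_focal_loss(logits, targets, class_weights, gamma=2.0):
    """Class-weighted focal loss for multi-label classification.
    logits, targets: (batch, num_classes); targets are 0/1 multi-hot labels.
    class_weights:   (num_classes,) tensor, larger for rare (tail) behaviors."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)       # probability of the true label
    return (class_weights * (1 - p_t) ** gamma * ce).mean()

# Example: give a rare behavior class three times the weight of the others.
weights = torch.tensor([1.0, 1.0, 3.0])
loss = weighted_focal_loss(torch.randn(8, 3), torch.randint(0, 2, (8, 3)).float(), weights)
```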
StuArt: Individualized Classroom Observation of Students with Automatic Behavior Recognition and Tracking
Each student matters, but it is hard for instructors to observe every student
during a course and provide immediate help to those who need it.
In this paper, we present StuArt, a novel automatic system designed for
individualized classroom observation, which empowers instructors to follow the
learning status of each student. StuArt can recognize five representative
student behaviors (hand-raising, standing, sleeping, yawning, and smiling) that
are highly related to engagement, and track their variation trends during
the course. To protect the privacy of students, all the variation trends are
indexed by the seat numbers without any personal identification information.
Furthermore, StuArt adopts various user-friendly visualization designs to help
instructors quickly understand the individual and whole learning status.
Experimental results on real classroom videos have demonstrated the superiority
and robustness of the embedded algorithms. We expect our system to promote the
development of large-scale individualized guidance of students.
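As a small illustration of the kind of privacy-preserving bookkeeping described above, the sketch below accumulates per-seat behavior counts over fixed time windows, keyed only by seat number; the data structure and window granularity are assumptions, not StuArt's implementation.

```python
from collections import defaultdict

# seat number -> behavior -> list of counts, one entry per time window
trends = defaultdict(lambda: defaultdict(list))

def update_trends(window_detections):
    """window_detections: (seat_number, behavior) pairs recognised during one
    time window (e.g. one minute of video). Only seat numbers are stored, so no
    personally identifying information enters the trend data."""
    counts = defaultdict(int)
    for seat, behavior in window_detections:
        counts[(seat, behavior)] += 1
    for (seat, behavior), n in counts.items():
        trends[seat][behavior].append(n)

update_trends([(12, "hand-raising"), (12, "hand-raising"), (7, "yawning")])
print(trends[12]["hand-raising"])  # [2]
```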
Understanding face and eye visibility in front-facing cameras of smartphones used in the wild
Commodity mobile devices are now equipped with high-resolution front-facing cameras, allowing applications in biometrics (e.g., FaceID in the iPhone X), facial expression analysis, or gaze interaction. However, it is unknown how often users hold devices in a way that allows capturing their face or eyes, and how this impacts detection accuracy. We collected 25,726 in-the-wild photos, taken from the front-facing camera of smartphones, as well as associated application usage logs. We found that the full face is visible about 29% of the time, and that in most cases the face is only partially visible. Furthermore, we identified an influence of users' current activity; for example, when watching videos, the eyes but not the entire face are visible 75% of the time in our dataset. We found that a state-of-the-art face detection algorithm performs poorly on photos taken from front-facing cameras. We discuss how these findings impact mobile applications that leverage face and eye detection, and derive practical implications for addressing the limitations of the state of the art.
A time series feature of variability to detect two types of boredom from motion capture of the head and shoulders
Boredom and disengagement metrics are crucial to the correctly timed implementation of adaptive interventions in interactive systems. Psychological research suggests that boredom (which other HCI teams have been able to partially quantify with pressure-sensing chair mats) is actually a composite: lethargy and restlessness. Here we present an innovative approach to the measurement and recognition of these two kinds of boredom, based on motion capture and video analysis of changes in head and shoulder positions. Discrete, three-minute, computer-presented stimuli (games, quizzes, films and music) covering a spectrum from engaging to boring/disengaging were used to elicit changes in cognitive/emotional states in seated, healthy volunteers. Interaction with the stimuli occurred with a handheld trackball instead of a mouse, so movements were assumed to be non-instrumental. Our results include a feature (standard deviation of windowed ranges) that may be more specific to boredom than mean speed of head movement, and that could be implemented in computer vision algorithms for disengagement detection.
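The feature named above, the standard deviation of windowed ranges, admits a straightforward reading: slide a window over a head-position trace, take the range (maximum minus minimum) inside each window, then take the standard deviation of those ranges. The sketch below implements that reading; the window and step sizes are illustrative assumptions, not the values used in the study.

```python
import numpy as np

def std_of_windowed_ranges(signal, window=60, step=30):
    """Standard deviation of windowed ranges of a 1-D position trace.
    window/step are in samples; defaults assume roughly 60 Hz motion capture."""
    ranges = [signal[i:i + window].max() - signal[i:i + window].min()
              for i in range(0, len(signal) - window + 1, step)]
    return float(np.std(ranges))

# Example: a bursty ("restless") trace versus a nearly still ("lethargic") one.
t = np.linspace(0.0, 60.0, 3600)                        # 60 s at 60 Hz
restless = np.where(np.random.rand(t.size) > 0.97, np.sin(2 * np.pi * t), 0.0)
lethargic = 0.01 * np.random.randn(t.size)
print(std_of_windowed_ranges(restless), std_of_windowed_ranges(lethargic))
```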