SCB-dataset: A Dataset for Detecting Student Classroom Behavior
The use of deep learning methods for automatic detection of students'
classroom behavior is a promising approach to analyze their class performance
and enhance teaching effectiveness. However, the lack of publicly available
datasets on student behavior poses a challenge for researchers in this field.
To address this issue, we propose a Student Classroom Behavior dataset
(SCB-dataset) that reflects real-life scenarios. Our dataset includes 11,248
labels and 4,003 images, with a focus on hand-raising behavior. We evaluated
the dataset using the YOLOv7 algorithm, achieving a mean average precision
(mAP) of up to 85.3%. We believe that our dataset can serve as a robust
foundation for future research in the field of student behavior detection and
promote further advancements in this area. Our SCB-dataset can be downloaded
from: https://github.com/Whiffe/SCB-dataset
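The mAP figure reported above comes from matching predicted boxes against ground-truth labels at an IoU threshold. A minimal sketch of that matching step (the box format, greedy strategy, and 0.5 threshold are common conventions assumed here, not the paper's exact protocol):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_detections(preds, gts, thr=0.5):
    """Greedily match predictions (assumed sorted by confidence) to ground
    truth; each ground-truth box may be matched at most once.

    Returns the number of true positives at the given IoU threshold.
    """
    unmatched = list(range(len(gts)))
    tp = 0
    for p in preds:
        best, best_iou = None, thr
        for gi in unmatched:
            v = iou(p, gts[gi])
            if v >= best_iou:
                best, best_iou = gi, v
        if best is not None:
            unmatched.remove(best)
            tp += 1
    return tp
```

From per-image true-positive counts, precision/recall curves and the averaged precision per class follow in the usual way.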
AI-Augmented Monitoring and Management by Image Analysis for Object Detection and Counting
Counting the number of objects in images has become an increasingly important topic in different applications, such as crowd counting, cell microscopy image analysis in biomedical imaging, and horticulture monitoring and prediction. Many studies have worked on automatic object counting with Convolutional Neural Networks (CNNs). This research aims to shed more light on the applications of deep learning models for counting objects in images of different places, such as growing fields, classrooms, streets, etc. We will study how CNNs predict the number of objects and measure the accuracy of trained models with different training parameters using the evaluation metrics mAP and RMSE. The performance of object detection and counting using a CNN, YOLOv5, will be analyzed. The model will be trained on the Global Wheat Head Detection 2021 dataset for crop counting and the COCO dataset for counting labeled objects. The performance of the optimized model on crowd counting will be tested with pictures taken on the Texas A&M University campus.
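The counting accuracy mentioned above is typically measured with RMSE over per-image counts. A minimal sketch, assuming counts are obtained by thresholding detector confidences (the threshold value and detection tuple format are illustrative assumptions):

```python
import math

def counts_from_detections(detections_per_image, conf_thr=0.25):
    """Turn per-image detection lists [(label, confidence), ...] into counts,
    keeping only detections above the confidence threshold."""
    return [sum(1 for _, c in dets if c >= conf_thr)
            for dets in detections_per_image]

def rmse(predicted_counts, true_counts):
    """Root-mean-square error between predicted and true object counts."""
    assert len(predicted_counts) == len(true_counts)
    se = sum((p - t) ** 2 for p, t in zip(predicted_counts, true_counts))
    return math.sqrt(se / len(true_counts))
```

RMSE penalizes large per-image count errors more heavily than mean absolute error, which is why it is a common companion metric to mAP in counting work.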
Automatic Eye-Gaze Following from 2-D Static Images: Application to Classroom Observation Video Analysis
In this work, we develop an end-to-end neural network-based computer vision system to automatically identify where each person within a 2-D image of a school classroom is looking ("gaze following"), as well as who she/he is looking at. Automatic gaze following could help facilitate data-mining of large datasets of classroom observation videos that are collected routinely in schools around the world, in order to understand social interactions between teachers and students. Our network is based on the architecture by Recasens et al. (2015) but is extended to (1) predict not only where, but whom the person is looking at; and (2) predict whether each person is looking at a target inside or outside the image. Since our focus is on classroom observation videos, we collected a gaze dataset (48,907 gaze annotations over 2,263 classroom images) for students and teachers in classrooms. Results of our experiments indicate that the proposed neural network can estimate the gaze target (either the spatial location or the face of a person) with substantially higher accuracy compared to several baselines.
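Conceptually, the final step of such a gaze-following system reduces a 2-D score map to a single target location, plus an in/out-of-frame decision. A toy sketch of that decoding step (the actual network's output heads differ; this only illustrates the idea):

```python
def gaze_target_from_heatmap(heatmap):
    """Pick the most likely gaze target as the argmax of a 2-D score map.

    heatmap: list of rows of scores; returns (row, col) of the peak.
    """
    best, best_pos = float("-inf"), (0, 0)
    for r, row in enumerate(heatmap):
        for c, v in enumerate(row):
            if v > best:
                best, best_pos = v, (r, c)
    return best_pos

def inside_image(score_inside, score_outside):
    """Binary in/out-of-frame decision from two classifier scores."""
    return score_inside >= score_outside
```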
Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence
Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with physical environments for mobile devices. MAR systems enable users to interact with MAR devices, such as smartphones and head-worn wearables, and perform seamless transitions from the physical world to a mixed world with digital entities. These MAR systems support user experiences using MAR devices to provide universal access to digital content. Over the past 20 years, several MAR systems have been developed; however, the studies and design of MAR frameworks have not yet been systematically reviewed from the perspective of user-centric design. This article presents the first effort of surveying existing MAR frameworks (count: 37) and further discusses the latest studies on MAR through a top-down approach: (1) MAR applications; (2) MAR visualisation techniques adaptive to user mobility and contexts; (3) systematic evaluation of MAR frameworks, including supported platforms and corresponding features such as tracking, feature extraction, and sensing capabilities; and (4) underlying machine learning approaches supporting intelligent operations within MAR systems. Finally, we summarise the development of emerging research fields and the current state-of-the-art, and discuss the important open challenges and possible theoretical and technical directions. This survey aims to benefit both researchers and MAR system developers alike.
Joint Multi-Person Body Detection and Orientation Estimation via One Unified Embedding
Human body orientation estimation (HBOE) is widely applied in various
domains, including robotics, surveillance, pedestrian analysis and
autonomous driving. Although many approaches have addressed the HBOE
problem in settings ranging from controlled scenes to challenging
in-the-wild environments, they assume human instances are already detected
and take a well-cropped sub-image as the input. This setting is less
efficient and prone to errors in real applications, such as crowded
scenes. In this paper, we propose a single-stage end-to-end trainable
framework for tackling the HBOE problem for multiple persons. By
integrating the prediction of bounding boxes and direction angles in one
embedding, our method can jointly estimate the location and orientation of
all bodies in one image directly. Our key idea is to integrate the HBOE
task into the multi-scale anchor channel predictions of persons, so that
both tasks benefit from shared intermediate features. Therefore, our
approach can naturally adapt to difficult instances involving low
resolution and occlusion, as in object detection. We validated the
efficiency and effectiveness of our method with extensive experiments on
the recently presented MEBOW benchmark. In addition, we annotated
ambiguous instances ignored by the MEBOW dataset and provided
corresponding weak body-orientation labels, preserving the dataset's
integrity and consistency to support multi-person studies. Our work is
available at \url{https://github.com/hnuzhy/JointBDOE}.
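The idea of packing box and orientation into one embedding can be illustrated with a toy decoder. The vector layout and sin/cos angle encoding below are assumptions for illustration, not the paper's exact parameterization:

```python
import math

def decode_joint_embedding(vec, img_w, img_h):
    """Decode one anchor's prediction vector into a box and an orientation.

    Hypothetical layout: [cx, cy, w, h, sin_theta, cos_theta], with box terms
    normalised to [0, 1]. The angle is recovered with atan2 so the regression
    target stays continuous across the 0/360-degree boundary.
    """
    cx, cy, w, h, s, c = vec
    box = ((cx - w / 2) * img_w, (cy - h / 2) * img_h,
           (cx + w / 2) * img_w, (cy + h / 2) * img_h)
    angle_deg = math.degrees(math.atan2(s, c)) % 360.0
    return box, angle_deg
```

Because both quantities come out of the same vector, one forward pass yields locations and orientations for every person at once, which is the efficiency argument the abstract makes.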
Optimized energy and air quality management of shared smart buildings in the COVID-19 scenario
Worldwide, increasing awareness of energy sustainability issues has been the main driver in developing the concepts of (Nearly) Zero Energy Buildings, where the reduced energy consumption is (nearly) fully covered by power locally generated from renewable sources. At the same time, recent advances in Internet of Things technologies are among the main enablers of Smart Homes and Buildings. The transition of conventional buildings into active environments that process, elaborate and react to online measured environmental quantities is being accelerated by aspects related to COVID-19, most notably in terms of air exchange and the monitoring of the density of occupants. In this paper, we address the problem of maximizing the energy efficiency and the comfort perceived by occupants, defined in terms of thermal comfort, visual comfort and air quality. The case study of the University of Pisa is considered as a practical example to show preliminary results of the aggregation of environmental data.
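The trade-off described above (occupant comfort versus energy use) is often scalarised into a single weighted objective for optimization. A toy sketch with illustrative weights, not the formulation used in the paper:

```python
def building_objective(thermal, visual, air_quality, energy_use,
                       weights=(0.3, 0.2, 0.3, 0.2)):
    """Scalarised objective: weighted comfort terms minus an energy penalty.

    All inputs are assumed normalised to [0, 1]; the weights are purely
    illustrative. Higher is better.
    """
    wt, wv, wa, we = weights
    return wt * thermal + wv * visual + wa * air_quality - we * energy_use
```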
SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection
Domain adaptive object detection (DAOD) aims to alleviate transfer
performance degradation caused by the cross-domain discrepancy. However, most
existing DAOD methods are dominated by outdated and computationally intensive
two-stage Faster R-CNN, which is not the first choice for industrial
applications. In this paper, we propose a novel semi-supervised domain adaptive
YOLO (SSDA-YOLO) based method to improve cross-domain detection performance by
integrating the compact one-stage stronger detector YOLOv5 with domain
adaptation. Specifically, we adapt the knowledge distillation framework with
the Mean Teacher model to assist the student model in obtaining instance-level
features of the unlabeled target domain. We also utilize the scene style
transfer to cross-generate pseudo images in different domains for remedying
image-level differences. In addition, an intuitive consistency loss is proposed
to further align cross-domain predictions. We evaluate SSDA-YOLO on public
benchmarks including PascalVOC, Clipart1k, Cityscapes, and Foggy Cityscapes.
Moreover, to verify its generalization, we conduct experiments on yawning
detection datasets collected from various real classrooms. The results show
considerable improvements of our method in these DAOD tasks, which reveals both
the effectiveness of proposed adaptive modules and the urgency of applying more
advanced detectors in DAOD. Our code is available on
\url{https://github.com/hnuzhy/SSDA-YOLO}.
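The Mean Teacher scheme mentioned above keeps the teacher model as an exponential moving average of the student's weights. A minimal sketch with plain floats standing in for model tensors (the decay value is an assumed typical choice):

```python
def ema_update(teacher_params, student_params, alpha=0.999):
    """Mean Teacher update: after each optimisation step, the teacher's
    weights drift toward the student's via an exponential moving average.

    alpha is the EMA decay; values close to 1 make the teacher change slowly,
    giving more stable pseudo-labels for the unlabeled target domain.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]
```

The slowly-moving teacher then produces the instance-level targets that supervise the student on unlabeled target-domain images.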
Student Classroom Behavior Detection based on Improved YOLOv7
Accurately detecting student behavior in classroom videos can aid in
analyzing their classroom performance and improving teaching effectiveness.
However, the current accuracy rate in behavior detection is low. To address
this challenge, we propose the Student Classroom Behavior Detection method,
based on improved YOLOv7. First, we created the Student Classroom Behavior
dataset (SCB-Dataset), which includes 18.4k labels and 4.2k images, covering
three behaviors: hand raising, reading, and writing. To improve detection
accuracy in crowded scenes, we integrated the biformer attention module and
Wise-IoU into the YOLOv7 network. Finally, experiments were conducted on the
SCB-Dataset, and the model achieved an mAP@0.5 of 79%, resulting in a 1.8%
improvement over previous results. The SCB-Dataset and code are available for
download at: https://github.com/Whiffe/SCB-dataset.
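Wise-IoU, mentioned above, belongs to the family of IoU-based box-regression losses. A plain IoU loss can be sketched as follows; Wise-IoU additionally reweights this term with a distance-based focusing factor, which is omitted here:

```python
def iou_loss(pred, gt):
    """Vanilla IoU loss: 1 - IoU, with boxes in (x1, y1, x2, y2) format.

    Perfect overlap gives loss 0; disjoint boxes give loss 1. Wise-IoU's
    focusing weight on top of this term is not implemented in this sketch.
    """
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_p + area_g - inter
    return 1.0 - (inter / union if union > 0 else 0.0)
```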