101 research outputs found
Exploration of Adolescent Depression Risk Prediction Based on Census Surveys and General Life Issues
In contemporary society, the escalating pressures of life and work have
propelled psychological disorders to the forefront of modern health concerns,
an issue that has been further accentuated by the COVID-19 pandemic. The
prevalence of depression among adolescents is steadily increasing, and
traditional diagnostic methods, which rely on scales or interviews, prove
particularly inadequate for detecting depression in young people. Addressing
these challenges, numerous AI-based methods for assisting in the diagnosis of
mental health issues have emerged. However, most of these methods center on
scale-based questionnaires or use multimodal approaches such as facial
expression recognition. Diagnosis of depression risk based on everyday habits
and behaviors has been limited to small-scale qualitative studies. Our research
leverages adolescent census data to predict depression risk, focusing on
children's experiences with depression and their daily life situations. We
introduce a method for managing severely imbalanced, high-dimensional data and
an adaptive predictive approach tailored to data structure characteristics.
Furthermore, we propose a cloud-based architecture for automatic online
learning and data updates. This study utilized publicly available NSCH youth
census data from 2020 to 2022, encompassing nearly 150,000 data entries. We
conducted basic data analyses and predictive experiments, demonstrating
significant performance improvements over standard machine learning and deep
learning algorithms. These results affirm our data processing method's broad
applicability to imbalanced medical data. Unlike typical research on
predictive methods, our study presents a comprehensive architectural solution
that accounts for a wider range of user needs.
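The abstract mentions a method for severely imbalanced, high-dimensional data without detailing it. As a hedged illustration of the underlying problem only, the sketch below applies standard scikit-learn class weighting to synthetic data; the dataset, model, and parameters are assumptions, not the paper's pipeline.

```python
# A minimal sketch of one standard way to handle severely imbalanced,
# high-dimensional tabular data, the problem the abstract above describes.
# The paper's actual method is not specified here; this is illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Synthetic stand-in for census-style data: 200 features, ~3% positive rate.
X, y = make_classification(n_samples=20000, n_features=200,
                           weights=[0.97, 0.03], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" scales the loss by inverse class frequency,
# so the rare positive class is not drowned out during training.
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```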
Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation
Video scene graph generation (VidSGG) aims to identify objects in visual
scenes and infer their relationships for a given video. It requires not only a
comprehensive understanding of each object scattered across the whole scene but
also a deep dive into their temporal motions and interactions. Inherently,
object pairs and their relationships enjoy spatial co-occurrence correlations
within each image and temporal consistency/transition correlations across
different images, which can serve as prior knowledge to facilitate VidSGG model
learning and inference. In this work, we propose a spatial-temporal
knowledge-embedded transformer (STKET) that incorporates the prior
spatial-temporal knowledge into the multi-head cross-attention mechanism to
learn more representative relationship representations. Specifically, we first
learn spatial co-occurrence and temporal transition correlations in a
statistical manner. Then, we design spatial and temporal knowledge-embedded
layers that use multi-head cross-attention to fully explore the interaction
between visual representations and the prior knowledge, generating spatial-
and temporal-embedded representations, respectively. Finally, we
aggregate these representations for each subject-object pair to predict the
final semantic labels and their relationships. Extensive experiments show that
STKET outperforms current competing algorithms by a large margin, e.g.,
improving mR@50 by 8.1%, 4.7%, and 2.1% under different settings.
Comment: Technical Report
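The abstract describes knowledge-embedded layers in which multi-head cross-attention fuses visual pair features with statistical priors. The following PyTorch sketch illustrates that general idea only; the module name `KnowledgeEmbeddedLayer`, the tensor shapes, and the linear embedding of the co-occurrence prior are assumptions, not the authors' STKET implementation.

```python
# A minimal sketch of knowledge-embedded cross-attention: visual features of
# subject-object pairs attend over embeddings derived from prior statistics.
# All names and shapes are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class KnowledgeEmbeddedLayer(nn.Module):
    def __init__(self, dim=256, num_heads=8, num_predicates=50):
        super().__init__()
        # Learnable projection of per-pair prior predicate statistics.
        self.prior_embed = nn.Linear(num_predicates, dim)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual_feats, cooccurrence_prior):
        # visual_feats: (B, N, dim) -- features of N subject-object pairs
        # cooccurrence_prior: (B, N, num_predicates) -- statistical predicate
        # distribution per pair, estimated from training-set co-occurrences
        knowledge = self.prior_embed(cooccurrence_prior)        # (B, N, dim)
        attended, _ = self.cross_attn(query=visual_feats,
                                      key=knowledge, value=knowledge)
        return self.norm(visual_feats + attended)               # residual + norm

# Example usage: a batch of 4 frames with 10 candidate pairs each.
layer = KnowledgeEmbeddedLayer()
feats = torch.randn(4, 10, 256)
prior = torch.softmax(torch.randn(4, 10, 50), dim=-1)
out = layer(feats, prior)   # (4, 10, 256)
```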
- …