Actor-Transformers for Group Activity Recognition
This paper strives to recognize individual actions and group activities from
videos. While existing solutions for this challenging problem explicitly model
spatial and temporal relationships based on the locations of individual actors, we
propose an actor-transformer model able to learn and selectively extract
information relevant for group activity recognition. We feed the transformer
with rich actor-specific static and dynamic representations expressed by
features from a 2D pose network and 3D CNN, respectively. We empirically study
different ways to combine these representations and show their complementary
benefits. Experiments show what is important to transform and how it should be
transformed. Furthermore, actor-transformers achieve state-of-the-art results
on two publicly available benchmarks for group activity recognition,
outperforming the previous best published results by a considerable margin.
Comment: CVPR 202
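The core mechanism the abstract describes is a transformer layer attending over per-actor embeddings so that each actor's representation is refined by information from the other actors. A minimal numpy sketch of one scaled dot-product self-attention step over fused actor features (the projection matrices, feature sizes, and the random fusion input are illustrative assumptions, not the paper's trained model):

```python
import numpy as np

def self_attention(actors, d_k):
    """One scaled dot-product self-attention step over per-actor features.

    actors: (num_actors, d_model) matrix; each row stands in for one actor's
    fused static (2D pose) + dynamic (3D CNN) embedding.
    """
    rng = np.random.default_rng(0)
    d_model = actors.shape[1]
    # Hypothetical projection matrices; a trained model learns these.
    W_q = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    W_k = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    W_v = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)

    Q, K, V = actors @ W_q, actors @ W_k, actors @ W_v
    scores = Q @ K.T / np.sqrt(d_k)                      # actor-to-actor affinities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)        # row-wise softmax
    return weights @ V                                   # refined actor features

actors = np.random.default_rng(1).standard_normal((12, 64))  # e.g. 12 players
refined = self_attention(actors, d_k=32)
print(refined.shape)  # (12, 32)
```

Pooling the refined rows (e.g. a max or mean over actors) would give a single group-level feature for the activity classifier.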
Group Activity Recognition on Outdoor Scenes
In this research study, we propose an automatic group activity recognition approach that models the interdependencies of group activity features over time. Unlike simple human activity recognition approaches, the distinguishing characteristics of group activities are often determined by how the movements of people influence one another. We propose to model the group interdependencies in both the motion and location spaces. These spaces are extended to time-space and time-movement spaces and modelled using Kernel Density Estimation (KDE). Such representations are then fed into a machine learning classifier which identifies the group activity. Unlike other approaches to group activity recognition, we do not rely on the manual annotation of pedestrian tracks from the video sequence.
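The abstract's pipeline step of turning tracked positions into a KDE-based feature vector can be sketched as follows; this is a minimal Gaussian KDE evaluated on a fixed grid, with toy synthetic positions and an arbitrary bandwidth standing in for real detector/tracker output:

```python
import numpy as np

def kde_features(points, grid, bandwidth=0.5):
    """Evaluate a Gaussian kernel density estimate on a grid of query points.

    points: (n, 2) observed (x, y) positions of group members over a clip.
    grid:   (m, 2) query locations; the returned densities form a
            fixed-length feature vector for a downstream classifier.
    """
    diff = grid[:, None, :] - points[None, :, :]        # (m, n, 2)
    sq_dist = (diff ** 2).sum(axis=-1)                  # (m, n)
    kernel = np.exp(-sq_dist / (2 * bandwidth ** 2))
    norm = points.shape[0] * 2 * np.pi * bandwidth ** 2
    return kernel.sum(axis=1) / norm                    # (m,) densities

# Toy stand-in for tracked pedestrian data.
rng = np.random.default_rng(0)
positions = rng.normal(loc=[5.0, 3.0], scale=0.8, size=(200, 2))
xs, ys = np.meshgrid(np.linspace(2, 8, 8), np.linspace(0, 6, 8))
grid = np.column_stack([xs.ravel(), ys.ravel()])
features = kde_features(positions, grid)   # 64-dim location-space feature
print(features.shape)  # (64,)
```

The same construction applied to velocity vectors instead of positions would give the motion-space counterpart described in the abstract.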
Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living
Domain shifts, such as appearance changes, are a key challenge in real-world
applications of activity recognition models, which range from assistive
robotics and smart homes to driver observation in intelligent vehicles. For
example, while simulations are an excellent way of economical data collection,
a Synthetic-to-Real domain shift leads to a > 60% drop in accuracy when
recognizing Activities of Daily Living (ADLs). We tackle this challenge and
introduce an activity domain generation framework which creates novel ADL
appearances (novel domains) from different existing activity modalities (source
domains) inferred from video training data. Our framework computes human poses,
heatmaps of body joints, and optical flow maps and uses them alongside the
original RGB videos to learn the essence of source domains in order to generate
completely new ADL domains. The model is optimized by maximizing the distance
between the existing source appearances and the generated novel appearances
while ensuring that the semantics of an activity are preserved through an
additional classification loss. While source-data multimodality is an important
concept in this design, our setup does not rely on multi-sensor setups (i.e.,
all source modalities are inferred from a single video only). The newly created
activity domains are then integrated in the training of the ADL classification
networks, resulting in models far less susceptible to changes in data
distributions. Extensive experiments on the Synthetic-to-Real benchmark
Sims4Action demonstrate the potential of the domain generation paradigm for
cross-domain ADL recognition, setting new state-of-the-art results. Our code is
publicly available at https://github.com/Zrrr1997/syn2real_DG
Comment: 8 pages, 7 figures, to be published in IROS 202
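The training objective the abstract outlines (maximize the distance between source and generated appearances while a classification term preserves activity semantics) can be written as a simple combined loss. The sketch below uses an L2 appearance distance and cross-entropy on hypothetical feature vectors and logits; the function names, the distance choice, and the weighting `lam` are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def cross_entropy(logits, label):
    """Numerically stable cross-entropy for one sample."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def generation_loss(src_feat, gen_feat, gen_logits, label, lam=1.0):
    """Combined generator objective: push generated appearances away from
    the source (negative distance term) while keeping the activity label
    recognizable (classification term)."""
    appearance_term = -np.linalg.norm(src_feat - gen_feat)   # maximize distance
    semantic_term = cross_entropy(gen_logits, label)         # preserve semantics
    return appearance_term + lam * semantic_term

rng = np.random.default_rng(0)
src = rng.standard_normal(128)
near = src + 0.1 * rng.standard_normal(128)   # barely novel appearance
far = src + 5.0 * rng.standard_normal(128)    # strongly novel appearance
logits = np.array([2.0, 0.1, -1.0])
# For identical classifier outputs, the more novel appearance scores lower.
print(generation_loss(src, far, logits, label=0)
      < generation_loss(src, near, logits, label=0))  # True
```

Minimizing this loss therefore rewards novelty in appearance space without sacrificing the activity class.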
A Survey on Visual Analytics of Social Media Data
The unprecedented availability of social media data offers substantial opportunities for data owners, system operators, solution providers, and end users to explore and understand social dynamics. However, the exponential growth in the volume, velocity, and variability of social media data prevents people from fully utilizing such data. Visual analytics, which is an emerging research direction, ha..
Educational Environments with Cultural and Religious Diversity: Psychometric Analysis of the Cyberbullying Scale
The objective of this research is to adapt and validate a useful instrument to diagnose
cyberbullying, provoked by intolerance towards cultural and religious diversity, identifying the profile
of the aggressor and the victim. The study was carried out using the Delphi technique, exploratory
factor analysis (EFA), and confirmatory factor analysis (CFA). The selected sample was composed
of 1478 adolescents, all students in Compulsory Secondary Education in Spain. The instrument
items were extracted from relevant scales on the topic. The initial questionnaire was composed of
52 items and three underlying constructs. After validation with EFA (n = 723), the structure was
checked, and the model was later corroborated with CFA (n = 755) through structural equations
(RMSEA = 0.05, CFI = 0.826, TLI = 0.805). The reliability and internal consistency of the instrument
were also tested, with values for all dimensions being higher than 0.8. It is concluded that this new
questionnaire has 38 items and three dimensions. It has an acceptable validity and reliability, and can
be used to diagnose cyberbullying caused by the non-acceptance of cultural and religious diversity in
Compulsory Secondary Education students.
Part of this work has been funded by the competitive research project “Values for intercultural coexistence in the students of the Autonomous City of Melilla. An intervention proposal”.
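The reliability figures the abstract reports (all dimensions above 0.8) are conventionally computed as Cronbach's alpha. A minimal sketch of that computation on toy Likert-style data (the synthetic responses and item count are illustrative, not the study's data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score);
    values above ~0.8 are conventionally read as good internal consistency.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Toy data: six items driven by one shared latent trait, so alpha is high.
rng = np.random.default_rng(0)
trait = rng.normal(size=(300, 1))
responses = trait + 0.5 * rng.normal(size=(300, 6))
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

With uncorrelated items the total-score variance approaches the sum of item variances and alpha falls toward zero, which is why the statistic tracks how consistently items measure one underlying dimension.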