    Actor-Transformers for Group Activity Recognition

    Get PDF
    This paper strives to recognize individual actions and group activities from videos. While existing solutions for this challenging problem explicitly model spatial and temporal relationships based on the locations of individual actors, we propose an actor-transformer model able to learn and selectively extract information relevant for group activity recognition. We feed the transformer with rich actor-specific static and dynamic representations, expressed by features from a 2D pose network and a 3D CNN, respectively. We empirically study different ways to combine these representations and show their complementary benefits. Experiments show what is important to transform and how it should be transformed. Moreover, actor-transformers achieve state-of-the-art results on two publicly available benchmarks for group activity recognition, outperforming the previous best published results by a considerable margin. Comment: CVPR 2020
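
    A minimal sketch of the actor-transformer idea described above, assuming per-actor features have already been extracted: static (2D pose) and dynamic (3D CNN) representations are fused by concatenation, and self-attention runs across actor tokens. The module names, feature dimensions, and mean-pooling readout are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ActorTransformer(nn.Module):
    def __init__(self, static_dim=256, dynamic_dim=512, d_model=256,
                 n_heads=8, n_layers=2, n_activities=8):
        super().__init__()
        # Fuse static (pose) and dynamic (3D CNN) actor features by
        # concatenation, then project into the transformer's embedding space.
        self.embed = nn.Linear(static_dim + dynamic_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.group_head = nn.Linear(d_model, n_activities)

    def forward(self, static_feats, dynamic_feats):
        # static_feats:  (batch, actors, static_dim)  from a 2D pose network
        # dynamic_feats: (batch, actors, dynamic_dim) from a 3D CNN
        x = self.embed(torch.cat([static_feats, dynamic_feats], dim=-1))
        x = self.encoder(x)            # self-attention across actors
        group = x.mean(dim=1)          # pool actor tokens into a group token
        return self.group_head(group)  # group-activity logits

logits = ActorTransformer()(torch.randn(4, 12, 256), torch.randn(4, 12, 512))
```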

    Group Activity Recognition on Outdoor Scenes

    Get PDF
    In this research study, we propose an automatic group activity recognition approach that models the interdependencies of group activity features over time. Unlike in simple human activity recognition approaches, the distinguishing characteristics of group activities are often determined by how the movements of people influence one another. We propose to model the group interdependencies in both motion and location spaces. These spaces are extended to time-space and time-movement spaces and modelled using Kernel Density Estimation (KDE). Such representations are then fed into a machine learning classifier which identifies the group activity. Unlike other approaches to group activity recognition, we do not rely on the manual annotation of pedestrian tracks from the video sequence.
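
    A minimal sketch of the KDE-based representation described above, assuming tracked (t, x, y) positions are available per clip; the fixed evaluation grid, SciPy's default bandwidth, and the SVM classifier are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.svm import SVC

# Fixed (t, x, y) evaluation grid shared by all clips (coarse for brevity).
t, x, y = np.meshgrid(np.linspace(0, 1, 5),
                      np.linspace(0, 1, 5),
                      np.linspace(0, 1, 5))
grid = np.vstack([t.ravel(), x.ravel(), y.ravel()])

def kde_feature(samples, grid):
    # samples: (3, N) points pooled over the whole group; evaluating the
    # fitted density on a fixed grid yields a fixed-length feature vector.
    return gaussian_kde(samples)(grid)

def clip_descriptor(tracks):
    # tracks: (N, 3) array of (t, x, y) detections; frame-to-frame position
    # differences stand in for the movement (velocity) space.
    velocity = np.diff(tracks, axis=0)
    time_space = tracks.T
    time_movement = np.vstack([tracks[1:, 0], velocity[:, 1], velocity[:, 2]])
    return np.concatenate([kde_feature(time_space, grid),
                           kde_feature(time_movement, grid)])

# Train a classifier on labelled clips (synthetic data for illustration).
rng = np.random.default_rng(0)
X = np.stack([clip_descriptor(rng.random((50, 3))) for _ in range(20)])
y = rng.integers(0, 3, size=20)
clf = SVC().fit(X, y)
```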

    Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living

    Full text link
    Domain shifts, such as appearance changes, are a key challenge in real-world applications of activity recognition models, which range from assistive robotics and smart homes to driver observation in intelligent vehicles. For example, while simulations are an excellent way of economical data collection, a Synthetic-to-Real domain shift leads to a > 60% drop in accuracy when recognizing Activities of Daily Living (ADLs). We tackle this challenge and introduce an activity domain generation framework which creates novel ADL appearances (novel domains) from different existing activity modalities (source domains) inferred from video training data. Our framework computes human poses, heatmaps of body joints, and optical flow maps and uses them alongside the original RGB videos to learn the essence of source domains in order to generate completely new ADL domains. The model is optimized by maximizing the distance between the existing source appearances and the generated novel appearances while ensuring that the semantics of an activity are preserved through an additional classification loss. While source-data multimodality is an important concept in this design, our setup does not rely on multi-sensor setups (i.e., all source modalities are inferred from a single video only). The newly created activity domains are then integrated into the training of the ADL classification networks, resulting in models far less susceptible to changes in data distributions. Extensive experiments on the Synthetic-to-Real benchmark Sims4Action demonstrate the potential of the domain generation paradigm for cross-domain ADL recognition, setting new state-of-the-art results. Our code is publicly available at https://github.com/Zrrr1997/syn2real_DG Comment: 8 pages, 7 figures, to be published in IROS 2022
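
    A hedged sketch of the training objective described above: the generated appearance is pushed away from the source appearance while a classification loss preserves the activity semantics. The generator and classifier callables, the L2 appearance distance, and the weighting term are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def domain_generation_loss(generator, classifier, rgb, pose, flow, labels,
                           weight=1.0):
    # Generate a novel-domain clip from the modalities inferred from video.
    novel = generator(rgb, pose, flow)
    # Push the generated appearance away from the source appearance; the
    # distance enters negated because the optimizer minimizes the total loss.
    appearance_distance = F.mse_loss(novel, rgb)
    # A classification loss keeps the semantics of the activity intact.
    semantic_loss = F.cross_entropy(classifier(novel), labels)
    return semantic_loss - weight * appearance_distance
```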

    A Survey on Visual Analytics of Social Media Data

    Get PDF
    The unprecedented availability of social media data offers substantial opportunities for data owners, system operators, solution providers, and end users to explore and understand social dynamics. However, the exponential growth in the volume, velocity, and variability of social media data prevents people from fully utilizing such data. Visual analytics, which is an emerging research direction, ha..

    Educational Environments with Cultural and Religious Diversity: Psychometric Analysis of the Cyberbullying Scale

    Get PDF
    The objective of this research is to adapt and validate a useful instrument to diagnose cyberbullying provoked by intolerance towards cultural and religious diversity, identifying the profiles of the aggressor and the victim. The study was carried out using the Delphi technique, exploratory factor analysis (EFA), and confirmatory factor analysis (CFA). The selected sample was composed of 1478 adolescents, all students in Compulsory Secondary Education in Spain. The instrument items were extracted from relevant scales on the topic. The initial questionnaire was composed of 52 items and three underlying constructs. After validation with EFA (n = 723), the structure was checked, and the model was later corroborated with CFA (n = 755) through structural equations (RMSEA = 0.05, CFI = 0.826, TLI = 0.805). The reliability and internal consistency of the instrument were also tested, with values for all dimensions higher than 0.8. It is concluded that this new questionnaire has 38 items and three dimensions, has acceptable validity and reliability, and can be used to diagnose cyberbullying caused by the non-acceptance of cultural and religious diversity in Compulsory Secondary Education students. Part of this work was funded by the competitive research project “Values for intercultural coexistence in the students of the Autonomous City of Melilla. An intervention proposal”.
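
    As a hedged illustration of the reliability check reported above (all dimensions above 0.8), the snippet below computes Cronbach's alpha, a standard measure of internal consistency; the synthetic item matrix and the choice of alpha itself are assumptions, since the abstract does not name the coefficient used.

```python
import numpy as np

def cronbach_alpha(items):
    # items: (respondents, items) matrix of scores for one dimension.
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Synthetic correlated responses: a shared latent trait plus item noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=(755, 1))
responses = trait + 0.5 * rng.normal(size=(755, 12))
print(round(cronbach_alpha(responses), 3))  # well above 0.8 here
```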