87 research outputs found

    PsyMo: A Dataset for Estimating Self-Reported Psychological Traits from Gait

    Psychological trait estimation from external factors such as movement and appearance is a challenging, long-standing problem in psychology, grounded chiefly in the psychological theory of embodiment. To date, attempts to tackle this problem have relied on small-scale private datasets with intrusive body-attached sensors. Potential applications of an automated system for psychological trait estimation include assessing occupational fatigue and mental state, as well as marketing and advertising. In this work, we propose PsyMo (Psychological traits from Motion), a novel, multi-purpose and multi-modal dataset for exploring psychological cues manifested in walking patterns. We gathered walking sequences from 312 subjects in 7 walking variations and 6 camera angles. In conjunction with the walking sequences, participants filled in 6 psychological questionnaires, totalling 17 psychometric attributes related to personality, self-esteem, fatigue, aggressiveness and mental health. We propose two evaluation protocols for psychological trait estimation. Alongside the estimation of self-reported psychological traits from gait, the dataset can serve as a drop-in replacement for benchmarking gait recognition methods. We anonymize all cues related to the identity of the subjects and publicly release only silhouettes, 2D / 3D human skeletons and 3D SMPL human meshes.

    GaitPT: Skeletons Are All You Need For Gait Recognition

    The analysis of walking patterns is an important area of research with numerous applications in security, healthcare, sports and human-computer interaction. Recently, walking patterns have been regarded as a unique fingerprint for automatic person identification at a distance. In this work, we propose a novel gait recognition architecture called Gait Pyramid Transformer (GaitPT) that leverages pose estimation skeletons to capture unique walking patterns, without relying on appearance information. GaitPT adopts a hierarchical transformer architecture that effectively extracts both spatial and temporal features of movement in an anatomically consistent manner, guided by the structure of the human skeleton. Our results show that GaitPT achieves state-of-the-art performance compared to other skeleton-based gait recognition works, in both controlled and in-the-wild scenarios. GaitPT obtains 82.6% average accuracy on CASIA-B, surpassing other works by a margin of 6%. Moreover, it obtains 52.16% Rank-1 accuracy on GREW, outperforming both skeleton-based and appearance-based approaches.
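    The hierarchical, anatomy-guided idea behind GaitPT can be illustrated with a minimal sketch: per-joint motion features are pooled into anatomical parts, and parts into a whole-body descriptor. This is not the authors' implementation; the COCO-style 17-joint layout, the part grouping, and the displacement features are illustrative assumptions.

```python
import numpy as np

# Hypothetical anatomical grouping of 17 COCO-style joints into 5 parts;
# the actual GaitPT partitioning may differ.
PARTS = {
    "head":      [0, 1, 2, 3, 4],
    "left_arm":  [5, 7, 9],
    "right_arm": [6, 8, 10],
    "left_leg":  [11, 13, 15],
    "right_leg": [12, 14, 16],
}

def hierarchical_gait_features(skeletons):
    """Reduce a (T, 17, 2) skeleton sequence to a single gait descriptor.

    Stage 1: per-joint features (here, frame-to-frame displacement).
    Stage 2: pool joints into anatomical parts.
    Stage 3: pool parts and frames into one whole-body vector.
    """
    motion = np.diff(skeletons, axis=0)               # (T-1, 17, 2) joint motion
    part_feats = [motion[:, idx].mean(axis=1)         # (T-1, 2) per part
                  for idx in PARTS.values()]
    body = np.stack(part_feats, axis=1)               # (T-1, 5, 2)
    return body.reshape(body.shape[0], -1).mean(axis=0)  # (10,) descriptor

seq = np.random.rand(30, 17, 2)                       # 30 frames of 2D poses
descriptor = hierarchical_gait_features(seq)          # shape (10,)
```

    In the real model each pooling stage is a transformer layer rather than a mean, but the joint-to-part-to-body hierarchy is the same.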

    RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian

    Recently, large language models (LLMs) have become increasingly powerful and capable of solving a plethora of tasks through proper instructions in natural language. However, the vast majority of testing suites assume that the instructions are written in English, the de facto prompting language. Code intelligence and problem solving remain difficult tasks, even for the most advanced LLMs. Currently, there are no datasets to measure the generalization power of code-generation models in a language other than English. In this work, we present RoCode, a competitive programming dataset consisting of 2,642 problems written in Romanian, 11k solutions in C, C++ and Python, and comprehensive testing suites for each problem. The purpose of RoCode is to provide a benchmark for evaluating the code intelligence of language models trained on Romanian / multilingual text, as well as a fine-tuning set for pretrained Romanian models. Through our results and review of related works, we argue for the need to develop code models for languages other than English. Comment: Accepted at LREC-COLING 2024.
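    A competitive-programming benchmark of this kind is typically scored by running each candidate solution against the problem's hidden tests. The sketch below shows one such judging loop; the `(stdin, expected_stdout)` pair representation and the Python-only execution are illustrative assumptions, not the RoCode schema (which also covers C and C++).

```python
import subprocess
import sys

def judge(source, tests, time_limit=5):
    """Run a Python submission against (stdin, expected_stdout) pairs.

    A submission passes only if every test produces the expected output
    and the process exits cleanly within the time limit.
    """
    for stdin, expected in tests:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            input=stdin, capture_output=True, text=True, timeout=time_limit,
        )
        if proc.returncode != 0 or proc.stdout.strip() != expected.strip():
            return False
    return True

# Toy problem: read integers from one line, print their sum.
submission = "print(sum(map(int, input().split())))"
tests = [("1 2", "3"), ("10 -4", "6")]
verdict = judge(submission, tests)  # True
```

    Metrics such as pass@1 then reduce to the fraction of problems for which `judge` returns True.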

    CamLoc: Pedestrian Location Estimation through Body Pose Estimation on Smart Cameras

    Advances in hardware and algorithms are driving the exponential growth of the Internet of Things (IoT), with increasingly pervasive computations being performed near the data generation sources. With this wave of technology, a range of intelligent devices can perform local inferences (activity recognition, fitness monitoring, etc.), which have obvious advantages: reduced inference latency for interactive (real-time) applications and better data privacy by processing user data locally. Video processing can benefit many applications and data labelling systems, although performing this efficiently at the edge of the Internet is not trivial. In this paper, we show that accurate pedestrian location estimation is achievable using deep neural networks on fixed cameras with limited computing resources. Our approach, CamLoc, uses pose estimation from key body point detection to extend the pedestrian skeleton when the entire body is not in view (occluded by obstacles or partially outside the frame). Our evaluation dataset contains over 2,100 frames from surveillance cameras (including two cameras simultaneously pointing at the same scene from different angles), in 42 different scenarios of activity and occlusion. We make this dataset available together with annotations indicating the exact 2D position of the person in the frame as ground-truth information. CamLoc achieves good location estimation accuracy in these complex scenarios with high levels of occlusion, matching the performance of state-of-the-art solutions while using fewer computing resources and attaining a higher inference throughput.
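    The skeleton-extension idea can be sketched in a few lines: when the lower body is occluded, extrapolate the ground-contact point from the visible torso keypoints using average body proportions. The 1.6 leg-to-torso ratio below is a rough anthropometric assumption for illustration, not a value from the paper.

```python
import numpy as np

def ground_point(shoulder, hip, ankle=None):
    """Estimate the pixel where a pedestrian touches the ground.

    If the ankle keypoint was detected, use it directly; otherwise
    extend the shoulder-to-hip direction past the hip by an assumed
    leg-to-torso length ratio.
    """
    if ankle is not None:                       # full body visible
        return np.asarray(ankle, dtype=float)
    shoulder = np.asarray(shoulder, dtype=float)
    hip = np.asarray(hip, dtype=float)
    torso = hip - shoulder                      # torso vector in pixels
    LEG_TO_TORSO = 1.6                          # illustrative proportion
    return hip + torso * LEG_TO_TORSO

# Occluded lower body: shoulder at (100, 50), hip at (100, 110)
# -> estimated ground contact at (100, 206).
estimate = ground_point((100, 50), (100, 110))
```

    The estimated image point can then be projected to world coordinates with the fixed camera's calibration, which is what makes a static surveillance camera sufficient for location estimation.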

    Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues

    Depression, a prominent contributor to global disability, affects a substantial portion of the population. Efforts to detect depression from social media texts have been prevalent, yet only a few works have explored depression detection from user-generated video content. In this work, we address this research gap by proposing a simple and flexible multi-modal temporal model capable of discerning non-verbal depression cues from diverse modalities in noisy, real-world videos. We show that, for in-the-wild videos, using additional high-level non-verbal cues is crucial to achieving good performance; we extract and process audio speech embeddings, face emotion embeddings, face, body and hand landmarks, and gaze and blinking information. Through extensive experiments, we show that our model achieves state-of-the-art results on three key benchmark datasets for depression detection from video by a substantial margin. Our code is publicly available on GitHub. Comment: Accepted at the 46th European Conference on Information Retrieval (ECIR 2024).
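    The multi-modal idea can be reduced to a late-fusion sketch: each cue stream is a per-frame feature sequence that is pooled over time and concatenated into one clip-level vector for a downstream classifier. This is a simplification of the paper's temporal model, and the modality names and dimensions below are illustrative assumptions.

```python
import numpy as np

def fuse_modalities(modalities):
    """Pool each (T, D_m) modality over time and concatenate the results."""
    pooled = [feats.mean(axis=0) for feats in modalities.values()]
    return np.concatenate(pooled)

# One hypothetical 120-frame video clip with four cue streams.
clip = {
    "audio_embed":    np.random.rand(120, 256),
    "face_emotion":   np.random.rand(120, 8),
    "body_landmarks": np.random.rand(120, 66),   # 33 landmarks x (x, y)
    "gaze_blink":     np.random.rand(120, 4),
}
clip_vector = fuse_modalities(clip)  # shape (334,)
```

    Mean pooling discards temporal dynamics; the actual model instead processes the sequences with a temporal network, but the modality-concatenation structure is the same.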

    MICROSTRUCTURAL ANALYSIS OF THE INTERFACE BETWEEN SOME SUPERALLOYS AND COMPOSITE/CERAMIC MATERIALS

    The clinical success of aesthetic ceramic-fused-to-metal or composite-resin-bonded-to-metal restorations depends on the quality and strength of the composite/ceramic bond. To investigate the adhesion of ceramic and composite to the alloy surfaces, samples were prepared using metallographic techniques and then analyzed by Scanning Electron Microscopy (SEM). We studied a total of four superalloy samples, denoted S1, S2, S3 and S4. Each was treated with Vita ceramic powders, Noritake ceramic powders, Premise Indirect composite and an indigenous composite C1. At a magnification of x1500, the adherence between the layers, and the surface irregularities that improve it, could be properly observed. It is worth noting that after the sample preparation procedure, samples S1, S2 and S4 were damaged; the only sample remaining in good condition was S3.