PsyMo: A Dataset for Estimating Self-Reported Psychological Traits from Gait
Psychological trait estimation from external factors such as movement and
appearance is a challenging and long-standing problem in psychology, and is
principally based on the psychological theory of embodiment. To date, attempts
to tackle this problem have utilized private small-scale datasets with
intrusive body-attached sensors. Potential applications of an automated system
for psychological trait estimation include the assessment of occupational
fatigue and psychological state, as well as marketing and advertisement. In
this work, we propose PsyMo
(Psychological traits from Motion), a novel, multi-purpose and multi-modal
dataset for exploring psychological cues manifested in walking patterns. We
gathered walking sequences from 312 subjects in 7 different walking variations
and 6 camera angles. In conjunction with walking sequences, participants filled
in 6 psychological questionnaires, totalling 17 psychometric attributes related
to personality, self-esteem, fatigue, aggressiveness and mental health. We
propose two evaluation protocols for psychological trait estimation. Alongside
the estimation of self-reported psychological traits from gait, the dataset can
be used as a drop-in replacement to benchmark methods for gait recognition. We
anonymize all cues related to the identity of the subjects and publicly release
only silhouettes, 2D/3D human skeletons, and 3D SMPL human meshes.
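
For illustration, the sketch below shows one way a multi-modal gait sample and coarse trait labels might be organized in code; the field names, shapes, and three-way bucketing are assumptions for exposition, not the official PsyMo API.

from dataclasses import dataclass, field
from typing import Dict
import numpy as np

@dataclass
class GaitSample:
    subject_id: str            # anonymized subject identifier
    walking_variation: str     # one of the 7 recorded walking variations
    camera_angle: int          # one of the 6 viewpoints, in degrees
    silhouettes: np.ndarray    # (T, H, W) binary silhouette frames
    skeleton_2d: np.ndarray    # (T, J, 2) 2D joint coordinates
    skeleton_3d: np.ndarray    # (T, J, 3) 3D joint coordinates
    smpl_params: np.ndarray    # (T, D) SMPL pose/shape parameters
    psych_scores: Dict[str, float] = field(default_factory=dict)  # 17 attributes

def coarse_label(score: float, thresholds=(0.33, 0.66)) -> int:
    """Bucket a normalized questionnaire score into low/medium/high (0/1/2),
    a common way to turn trait estimation into a classification task."""
    return sum(score > t for t in thresholds)

print(coarse_label(0.5))  # -> 1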
GaitPT: Skeletons Are All You Need For Gait Recognition
The analysis of walking patterns is an important area of research that has
numerous applications in security, healthcare, sports and human-computer
interaction. Lately, walking patterns have been regarded as a unique
fingerprinting method for automatic person identification at a distance. In
this work, we propose a novel gait recognition architecture called Gait Pyramid
Transformer (GaitPT) that leverages pose estimation skeletons to capture unique
walking patterns, without relying on appearance information. GaitPT adopts a
hierarchical transformer architecture that effectively extracts both spatial
and temporal features of movement in an anatomically consistent manner, guided
by the structure of the human skeleton. Our results show that GaitPT achieves
state-of-the-art performance compared to other skeleton-based gait recognition
works, in both controlled and in-the-wild scenarios. GaitPT obtains 82.6%
average accuracy on CASIA-B, surpassing other works by a margin of 6%.
Moreover, it obtains 52.16% Rank-1 accuracy on GREW, outperforming both
skeleton-based and appearance-based approaches.
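
As a rough illustration of the idea (not the official GaitPT code), the sketch below applies attention first across the joints of each frame and then across time; the embedding size, class count, and simple mean-pooling between stages are assumptions, whereas the actual model uses a multi-stage pyramid that groups joints anatomically.

import torch
import torch.nn as nn

class HierarchicalGaitEncoder(nn.Module):
    def __init__(self, dim=64, num_classes=74):
        super().__init__()
        self.joint_embed = nn.Linear(2, dim)  # embed (x, y) joint coordinates
        self.spatial = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, poses):                      # poses: (B, T, J, 2)
        B, T, J, _ = poses.shape
        x = self.joint_embed(poses)                # (B, T, J, dim)
        x = self.spatial(x.reshape(B * T, J, -1))  # attention across joints per frame
        x = x.mean(dim=1).reshape(B, T, -1)        # mean-pool joints into a frame token
        x = self.temporal(x)                       # attention across frames
        return self.head(x.mean(dim=1))            # sequence-level identity logits

logits = HierarchicalGaitEncoder()(torch.randn(2, 30, 17, 2))
print(logits.shape)  # torch.Size([2, 74])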
RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian
Recently, large language models (LLMs) have become increasingly powerful and
are now capable of solving a plethora of tasks when given proper instructions
in natural language. However, the vast majority of testing suites assume that
the instructions are written in English, the de facto prompting language. Code
intelligence and problem solving still remain difficult tasks, even for the
most advanced LLMs. Currently, there are no datasets to measure the
generalization power for code-generation models in a language other than
English. In this work, we present RoCode, a competitive programming dataset,
consisting of 2,642 problems written in Romanian, 11k solutions in C, C++ and
Python, and comprehensive testing suites for each problem. The purpose of RoCode
is to provide a benchmark for evaluating the code intelligence of language
models trained on Romanian / multilingual text as well as a fine-tuning set for
pretrained Romanian models. Through our results and review of related works, we
argue for the need to develop code models for languages other than English.
Comment: Accepted at LREC-COLING 2024.
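
A benchmark like this is typically scored by running each generated solution against the problem's test suite; the snippet below is a hedged sketch of such a harness for Python solutions (the actual RoCode evaluation scripts, file layout, and compiled-language handling may differ).

import os
import subprocess
import tempfile

def pass_rate(solution_src: str, tests: list, timeout: float = 2.0) -> float:
    """Run a candidate Python solution on (stdin, expected_stdout) test pairs
    and return the fraction of tests it passes."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(solution_src)
        path = f.name
    passed = 0
    try:
        for stdin_data, expected in tests:
            try:
                result = subprocess.run(["python3", path], input=stdin_data,
                                        capture_output=True, text=True,
                                        timeout=timeout)
                passed += result.stdout.strip() == expected.strip()
            except subprocess.TimeoutExpired:
                pass  # a timed-out run counts as a failed test
    finally:
        os.unlink(path)
    return passed / max(len(tests), 1)

# Toy example: "read two integers and print their sum".
print(pass_rate("a, b = map(int, input().split()); print(a + b)",
                [("1 2", "3"), ("10 -4", "6")]))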
CamLoc: Pedestrian Location Estimation through Body Pose Estimation on Smart Cameras
Advances in hardware and algorithms are driving the exponential growth of the Internet of Things (IoT), with increasingly more pervasive computations being performed near the data generation sources. With this wave of technology, a range of intelligent devices can perform local inferences (activity recognition, fitness monitoring, etc.), which have obvious advantages: reduced inference latency for interactive (real-time) applications and better data privacy by processing user data locally. Video processing can benefit many applications and data labelling systems, although performing it efficiently at the edge of the Internet is not trivial. In this paper, we show that accurate pedestrian location estimation is achievable using deep neural networks on fixed cameras with limited computing resources. Our approach, CamLoc, uses pose estimation from key body point detection to extend the pedestrian skeleton when the entire body is not in view (occluded by obstacles or partially outside the frame). Our evaluation dataset contains over 2,100 frames from surveillance cameras (including two cameras simultaneously pointing at the same scene from different angles), covering 42 different scenarios of activity and occlusion. We make this dataset available together with annotations indicating the exact 2D position of the person in the frame as ground-truth information. CamLoc achieves good location estimation accuracy in these complex scenarios with high levels of occlusion, matching the performance of state-of-the-art solutions while using fewer computing resources and attaining a higher inference throughput.
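
The core idea of extending a partially visible skeleton can be illustrated with a simple geometric fallback; the snippet below (not CamLoc's exact method) extrapolates a ground-contact point from the visible shoulder and hip keypoints using an assumed leg-to-torso length ratio.

import numpy as np

LEG_TO_TORSO_RATIO = 1.4  # assumed average leg-length / torso-length ratio

def estimate_ground_point(shoulder_mid: np.ndarray, hip_mid: np.ndarray) -> np.ndarray:
    """Extrapolate the feet location in image coordinates from the visible
    shoulder and hip midpoints, continuing along the torso direction."""
    torso_vec = hip_mid - shoulder_mid                # shoulders -> hips direction
    return hip_mid + LEG_TO_TORSO_RATIO * torso_vec   # project past the hips

# Example: shoulders at (120, 80) px, hips at (122, 160) px; the legs are occluded.
print(estimate_ground_point(np.array([120.0, 80.0]), np.array([122.0, 160.0])))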
Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues
Depression, a prominent contributor to global disability, affects a
substantial portion of the population. Efforts to detect depression from social
media texts have been prevalent, yet only a few works have explored depression
detection from user-generated video content. In this work, we address this
research gap by proposing a simple and flexible multi-modal temporal model
capable of discerning non-verbal depression cues from diverse modalities in
noisy, real-world videos. We show that, for in-the-wild videos, using
additional high-level non-verbal cues is crucial to achieving good performance,
and we therefore extract and process audio speech embeddings, face emotion
embeddings, face, body and hand landmarks, and gaze and blinking information.
Through extensive experiments, we show that our model achieves state-of-the-art
results on three key benchmark datasets for depression detection from video by
a substantial margin. Our code is publicly available on GitHub.
Comment: Accepted at the 46th European Conference on Information Retrieval
(ECIR 2024).
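
One plausible way to fuse such heterogeneous cues over time (a sketch under assumed modality names and dimensions, not the paper's exact architecture) is to project each modality into a shared space, average the projections per time step, and attend across time:

import torch
import torch.nn as nn

class NonVerbalFusion(nn.Module):
    def __init__(self, modality_dims, dim=128):
        super().__init__()
        # one linear projection per modality into a shared embedding space
        self.proj = nn.ModuleDict({name: nn.Linear(d, dim)
                                   for name, d in modality_dims.items()})
        self.temporal = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.classifier = nn.Linear(dim, 1)    # depression vs. control logit

    def forward(self, inputs):                 # inputs: {modality: (B, T, D_m)}
        tokens = torch.stack([self.proj[name](x) for name, x in inputs.items()])
        fused = tokens.mean(dim=0)             # average modalities per time step
        h = self.temporal(fused)               # attention across the time dimension
        return self.classifier(h.mean(dim=1))  # one video-level prediction

dims = {"audio": 512, "face_emotion": 8, "landmarks": 150, "gaze_blink": 4}
model = NonVerbalFusion(dims)
batch = {name: torch.randn(2, 64, d) for name, d in dims.items()}
print(model(batch).shape)  # torch.Size([2, 1])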
MICROSTRUCTURAL ANALYSIS OF THE INTERFACE BETWEEN SOME SUPERALLOYS AND COMPOSITE/CERAMIC MATERIALS
The clinical success of aesthetic ceramic-fused-to-metal or composite-resin-bonded-to-metal restorations depends on the quality and strength of the composite/ceramic bonding. To investigate the adhesion of the ceramic and composite layers to the alloy surfaces, samples were prepared using metallographic techniques and then analyzed by Scanning Electron Microscopy (SEM). We studied a total of four superalloy samples, denoted S1, S2, S3, and S4. Each of these was treated with Vita ceramic powders, Noritake ceramic powders, Premise Indirect composite, and an indigenous composite C1. At a magnification of ×1500, the adherence between the layers and the surface irregularities that improve this adherence could be properly observed. It is worth noting that after the sample preparation procedure, samples S1, S2, and S4 were damaged; the only sample remaining in good condition was S3.
