PsyMo: A Dataset for Estimating Self-Reported Psychological Traits from Gait
Psychological trait estimation from external factors such as movement and
appearance is a challenging and long-standing problem in psychology, and is
principally based on the psychological theory of embodiment. To date, attempts
to tackle this problem have utilized private small-scale datasets with
intrusive body-attached sensors. Potential applications of an automated system
for psychological trait estimation include estimating occupational fatigue and
psychological state, as well as marketing and advertising. In this work, we propose PsyMo
(Psychological traits from Motion), a novel, multi-purpose and multi-modal
dataset for exploring psychological cues manifested in walking patterns. We
gathered walking sequences from 312 subjects in 7 different walking variations
and 6 camera angles. In conjunction with walking sequences, participants filled
in 6 psychological questionnaires, totalling 17 psychometric attributes related
to personality, self-esteem, fatigue, aggressiveness and mental health. We
propose two evaluation protocols for psychological trait estimation. Alongside
the estimation of self-reported psychological traits from gait, the dataset can
be used as a drop-in replacement to benchmark methods for gait recognition. We
anonymize all cues related to the identity of the subjects and publicly release
only silhouettes, 2D / 3D human skeletons, and 3D SMPL human meshes.
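A trait-estimation protocol over such a dataset can be sketched in miniature. Everything below is hypothetical stand-in data (random "gait embeddings" and a synthetic psychometric score); the point is only the shape of the pipeline: subject-disjoint train/test split, a simple regressor, and an error metric on held-out subjects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for gait embeddings: one feature vector per walking
# sequence, plus one self-reported psychometric score per sequence's subject.
n_sequences, n_features = 200, 64
X = rng.normal(size=(n_sequences, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w + 0.1 * rng.normal(size=n_sequences)  # e.g. a self-esteem score

# Subject-disjoint split: sequences of the same subject must not appear in
# both train and test, or identity cues leak into the trait estimate.
train, test = slice(0, 150), slice(150, 200)

# Ridge regression in closed form: w = (X^T X + lam I)^{-1} X^T y
lam = 1.0
A = X[train].T @ X[train] + lam * np.eye(n_features)
w = np.linalg.solve(A, X[train].T @ y[train])

mae = np.abs(X[test] @ w - y[test]).mean()
print(f"MAE on held-out subjects: {mae:.3f}")
```

The ridge regressor is only a placeholder; the dataset's two released protocols would fix the actual split and metric.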
Expanded Parts Model for Semantic Description of Humans in Still Images
We introduce an Expanded Parts Model (EPM) for recognizing human attributes
(e.g. young, short hair, wearing suit) and actions (e.g. running, jumping) in
still images. An EPM is a collection of part templates which are learnt
discriminatively to explain specific scale-space regions in the images (in
human-centric coordinates). This is in contrast to current models, which consist
of relatively few (i.e., a mixture of) 'average' templates. EPM uses only a
subset of the parts to score an image and scores the image sparsely in space,
i.e. it ignores redundant and random background in an image. To learn our
model, we propose an algorithm which automatically mines parts and learns
corresponding discriminative templates together with their respective locations
from a large number of candidate parts. We validate our method on three recent
challenging datasets of human attributes and actions. We obtain convincing
qualitative and state-of-the-art quantitative results on the three datasets. Comment: Accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
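The sparse scoring idea can be illustrated with hypothetical part responses. The sketch below omits two things the paper actually does, learning the templates discriminatively and enforcing spatial non-overlap, and keeps only the core mechanism: each part keeps its best region, and only a small subset of parts contributes to the image score.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in: responses[p, r] is the dot product of part template p
# with the feature of scale-space region r (shapes are illustrative only).
n_parts, n_regions = 20, 50
responses = rng.normal(size=(n_parts, n_regions))

def epm_score(responses, k=5):
    """Score an image with only a subset of parts: each part keeps its best
    region, then only the top-k parts are summed; the rest (redundant or
    background regions) are ignored, which is the sparsity in space."""
    best_per_part = responses.max(axis=1)  # best region response per part
    top_k = np.sort(best_per_part)[-k:]    # only k parts fire on this image
    return top_k.sum()

score = epm_score(responses)
```

In the full model the choice of which k parts fire is made jointly with their locations during learning, not post hoc as here.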
Videoprompter: an ensemble of foundational models for zero-shot video understanding
Vision-language models (VLMs) classify the query video by calculating a
similarity score between the visual features and text-based class label
representations. Recently, large language models (LLMs) have been used to
enrich the text-based class labels by enhancing the descriptiveness of the
class names. However, these improvements are restricted to the text-based
classifier only, and the query visual features are not considered. In this
paper, we propose a framework which combines pre-trained discriminative VLMs
with pre-trained generative video-to-text and text-to-text models. We introduce
two key modifications to the standard zero-shot setting. First, we propose
language-guided visual feature enhancement and employ a video-to-text model to
convert the query video to its descriptive form. The resulting descriptions
contain vital visual cues of the query video, such as what objects are present
and their spatio-temporal interactions. These descriptive cues provide
additional semantic knowledge to VLMs to enhance their zero-shot performance.
Second, we propose video-specific prompts to LLMs to generate more meaningful
descriptions to enrich class label representations. Specifically, we introduce
prompt techniques to create a Tree Hierarchy of Categories for class names,
offering a higher-level action context for additional visual cues. We
demonstrate the effectiveness of our approach in video understanding across
three different zero-shot settings: 1) video action recognition, 2)
video-to-text and text-to-video retrieval, and 3) time-sensitive video tasks.
Consistent improvements across multiple benchmarks and with various VLMs
demonstrate the effectiveness of our proposed framework. Our code will be made
publicly available.
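The zero-shot scoring this builds on can be sketched with random stand-in embeddings. A real system would use pre-trained video and text encoders and LLM-generated class descriptions; here the embeddings, dimensions, and class count are all hypothetical, and only the mechanics survive: average the description embeddings per class, then rank classes by cosine similarity to the video feature.

```python
import numpy as np

rng = np.random.default_rng(2)

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical embeddings standing in for a CLIP-style VLM's outputs.
dim, n_classes, n_descriptions = 32, 4, 3
video_feat = normalize(rng.normal(size=dim))

# Each class label is enriched with several generated descriptions; their
# embeddings are averaged into a single classifier vector per class.
desc_embeds = normalize(rng.normal(size=(n_classes, n_descriptions, dim)))
class_embeds = normalize(desc_embeds.mean(axis=1))

# Zero-shot prediction: cosine similarity between video and class vectors.
scores = class_embeds @ video_feat
pred = int(np.argmax(scores))
```

The paper's video-to-text enhancement would additionally fold a generated description of the query video into `video_feat` before this comparison.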
An Adaptive Human Activity-Aided Hand-Held Smartphone-Based Pedestrian Dead Reckoning Positioning System
Pedestrian dead reckoning (PDR), enabled by smartphones’ embedded inertial sensors, is widely applied as a type of indoor positioning system (IPS). However, traditional PDR faces two challenges to improving its accuracy: a lack of robustness across different PDR-related human activities, and the accumulation of positioning error over time. To cope with these issues, we propose a novel adaptive human activity-aided PDR (HAA-PDR) IPS that consists of two main parts: human activity recognition (HAR) and PDR optimization. (1) For HAR, eight different locomotion-related activities are divided into two classes: steady-heading activities (ascending/descending stairs, stationary, normal walking, stationary stepping, and lateral walking) and non-steady-heading activities (door opening and turning). A hierarchical combination of a support vector machine (SVM) and a decision tree (DT) is used to recognize steady-heading activities, while an autoencoder-based deep neural network (DNN) and a heading range-based method are used to recognize door opening and turning, respectively. The overall HAR accuracy is over 98.44%. (2) For PDR optimization, a process automatically sets the parameters of the PDR differently for different activities to enhance step counting and step length estimation. Furthermore, a trajectory optimization method mitigates PDR error accumulation by exploiting the non-steady-heading activities: we divide the trajectory into small segments and reconstruct it after targeted optimization of each segment. Our method does not use any a priori knowledge of the building layout, plan, or map. Finally, the mean positioning error of our HAA-PDR in a multilevel building is 1.79 m, a significant improvement in accuracy over a baseline state-of-the-art PDR system.
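The position update at the heart of any PDR system, which the activity-adaptive parameter tuning then refines per activity, can be sketched as follows. The step lengths and headings are illustrative values, not outputs of the described estimators.

```python
import math

def pdr_update(x, y, heading_rad, step_length_m):
    """One pedestrian-dead-reckoning step: advance the 2-D position along the
    current heading (0 = north, clockwise positive) by the step length."""
    return (x + step_length_m * math.sin(heading_rad),
            y + step_length_m * math.cos(heading_rad))

# Walk 4 steps of 0.7 m due north, then 2 steps due east (pi/2 rad).
pos = (0.0, 0.0)
for _ in range(4):
    pos = pdr_update(*pos, heading_rad=0.0, step_length_m=0.7)
for _ in range(2):
    pos = pdr_update(*pos, heading_rad=math.pi / 2, step_length_m=0.7)
```

Because each update adds the error of the previous one, heading or step-length biases accumulate; this is exactly the drift that the segment-wise trajectory reconstruction above is designed to correct.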
Around-Body Interaction: Leveraging Limb Movements for Interacting in a Digitally Augmented Physical World
Recent technological advances have made head-mounted displays (HMDs) smaller
and untethered, fostering the vision of ubiquitous interaction with information
in a digitally augmented physical world. For interacting with such devices,
three main types of input, besides not-very-intuitive finger gestures, have
emerged so far: 1) touch input on the frame of the device, 2) touch input on
accessories (controllers), and 3) voice input. While these techniques have both
advantages and disadvantages depending on the current situation of the user,
they largely ignore the skills and dexterity that we show when interacting with
the real world: Throughout our lives, we have trained extensively to use our
limbs to interact with and manipulate the physical world around us.
This thesis explores how the skills and dexterity of our upper and lower
limbs, acquired and trained in interacting with the real world, can be
transferred to the interaction with HMDs. Thus, this thesis develops the vision
of around-body interaction, in which we use the space around our body, defined
by the reach of our limbs, for fast, accurate, and enjoyable interaction with
such devices. This work contributes four interaction techniques, two for the
upper limbs and two for the lower limbs: The first contribution shows how the
proximity between our head and hand can be used to interact with HMDs. The
second contribution extends the interaction with the upper limbs to multiple
users and illustrates how the registration of augmented information in the real
world can support cooperative use cases. The third contribution shifts the
focus to the lower limbs and discusses how foot taps can be leveraged as an
input modality for HMDs. The fourth contribution presents how lateral shifts of
the walking path can be exploited for mobile and hands-free interaction with
HMDs while walking. Comment: thesis.