
11,091 research outputs found


Reachable Workspace and Proximal Function Measures for Quantifying Upper Limb Motion.

Author: Bajcsy Ruzena
Cheng Louis
Han Jay J
Kurillo Gregorij
Lotz Jeffrey
Matthew Robert P
Seko Sarah
Publication venue: eScholarship, University of California
Publication date: 01/11/2020

There is a lack of quantitative measures for clinically assessing upper limb function. Conventional biomechanical performance measures are restricted to specialist labs due to hardware cost and complexity, and the resulting measurements require specialists for analysis. Depth cameras are low-cost, portable systems that can track surrogate joint positions. However, the tracked motions may not be biologically consistent, which can result in noisy, inaccurate movements. This paper introduces a rigid body modelling method to enforce biological feasibility of the recovered motions. This method is evaluated on an existing depth camera assessment: the reachable workspace (RW) measure for assessing gross shoulder function. As a rigid body model is used, position estimates of new proximal targets can be added, resulting in a proximal function (PF) measure for assessing a subject's ability to touch specific body landmarks. The accuracy and repeatability of these measures are assessed on ten asymptomatic subjects, with and without rigid body constraints. This analysis is performed both on a low-cost depth camera system and a gold-standard active motion capture system. The addition of rigid body constraints was found to improve accuracy and concordance of the depth camera system, particularly in lateral reaching movements. Both RW and PF measures were found to be feasible candidates for clinical assessment, with future analysis needed to determine their ability to detect changes within specific patient populations.

eScholarship - University of California

The Visual Social Distancing Problem

Author: Cristani Marco
Del Bue Alessio
Murino Vittorio
Setti Francesco
Vinciarelli Alessandro
Publication date: 01/01/2020

One of the main and most effective measures to contain the recent viral outbreak is the maintenance of so-called Social Distancing (SD). To comply with this constraint, workplaces, public institutions, transport and schools will likely adopt restrictions on the minimum inter-personal distance between people. Given this scenario, it is crucial to measure compliance with this physical constraint at scale, in order to understand why such distance limits are broken and whether a violation implies a possible threat given the scene context. All of this must comply with privacy policies and keep the measurement acceptable. To this end, we introduce the Visual Social Distancing (VSD) problem, defined as the automatic estimation of the inter-personal distance from an image, and the characterization of the related people aggregations. VSD is pivotal for a non-invasive analysis of whether people comply with the SD restriction, and for providing statistics about the level of safety of specific areas whenever this constraint is violated. We then discuss how VSD relates to previous literature in Social Signal Processing and indicate which existing Computer Vision methods can be used to address the problem. We conclude with future challenges related to the effectiveness of VSD systems, ethical implications and future application scenarios.

Comment: 9 pages, 5 figures. All the authors equally contributed to this manuscript and they are listed by alphabetical order. Under submissio

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca

Enlighten

CaloriNet: From silhouettes to calorie estimation in private environments

Author: Burghardt Tilo
Damen Dima
Hannuna Sion
Masullo Alessandro
Mirmehdi Majid
Ponce-López Victor
Publication date: 21/06/2018

We propose a novel deep fusion architecture, CaloriNet, for the online estimation of energy expenditure for free living monitoring in private environments, where RGB data is discarded and replaced by silhouettes. Our fused convolutional neural network architecture is trainable end-to-end, to estimate calorie expenditure, using temporal foreground silhouettes alongside accelerometer data. The network is trained and cross-validated on a publicly available dataset, SPHERE_RGBD + Inertial_calorie. Results show state-of-the-art minimum error on the estimation of energy expenditure (calories per minute), outperforming alternative, standard and single-modal techniques.

Comment: 11 pages, 7 figure

arXiv.org e-Print Archive

UCL Discovery

Explore Bristol Research

Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation

Author: Gao Zhimin
Li Wanqing
Ogunbona Philip
Wang Pichao
Zhang Hanling
Publication date: 01/01/2014

Recently, mid-level features have shown promising performance in computer vision. Mid-level features learned by incorporating class-level information are potentially more discriminative than traditional low-level local features. In this paper, an effective method is proposed to extract mid-level features from Kinect skeletons for 3D human action recognition. First, the orientation of each limb, defined by the two skeleton joints it connects, is computed and encoded into one of 27 states indicating the spatial relationship of the joints. Second, limbs are combined into parts and the limb states are mapped into part states. Finally, frequent pattern mining is employed to mine the most frequent and relevant (discriminative, representative and non-redundant) states of parts over several consecutive frames. These parts are referred to as Frequent Local Parts or FLPs. The FLPs allow us to build a powerful bag-of-FLP-based action representation. This new representation yields state-of-the-art results on MSR DailyActivity3D and MSR ActionPairs3D.
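The limb-encoding step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the 27 states are defined; the version below assumes a ternary quantization of the limb direction along each axis (3 x 3 x 3 = 27), and the function name `limb_state` and the tolerance `tol` are hypothetical, not taken from the paper.

```python
def limb_state(joint_a, joint_b, tol=0.05):
    """Encode the direction from joint_a to joint_b as an integer in 0..26.

    Assumption (not from the paper): each axis difference is quantized to
    -1, 0 or +1, using `tol` as the dead zone around zero.
    """
    def ternary(delta):
        if abs(delta) <= tol:
            return 0
        return 1 if delta > 0 else -1

    sx, sy, sz = (ternary(b - a) for a, b in zip(joint_a, joint_b))
    # Treat the three ternary digits as a base-3 number.
    return (sx + 1) * 9 + (sy + 1) * 3 + (sz + 1)


# Hypothetical 3D joint positions in metres (elbow -> wrist).
elbow = (0.10, 1.20, 2.00)
wrist = (0.35, 1.21, 1.80)
state = limb_state(elbow, wrist)
```

Per-frame limb states computed this way could then be grouped into part states and passed to a frequent pattern miner, as the abstract describes.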

arXiv.org e-Print Archive

Crossref

Research Online

Physics-Based Reconstruction and Analysis of Human Motion in Video

Author: 유리
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2021

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2021. 2. 이제희.

In computer graphics, simulating and analyzing human movement have been research topics of interest since the 1960s. Still, simulating realistic human movements in a 3D virtual world is a challenging task in computer graphics. In general, motion capture techniques have been used. Although motion capture guarantees realistic results and high-quality data, it requires a lot of equipment and the process is complicated. Recently, techniques for 3D human pose estimation from 2D video have advanced remarkably. Researchers in computer graphics and computer vision have attempted to reconstruct various human motions from video data. However, existing methods cannot robustly estimate dynamic actions and do not work on videos filmed with a moving camera. In this thesis, we propose methods to reconstruct dynamic human motions from in-the-wild videos and to control the motions. First, we developed a framework to reconstruct motion from videos using prior physics knowledge. For dynamic motions such as backspin, the poses estimated by a state-of-the-art method are incomplete and include unreliable root trajectory or lack intermediate poses. We designed a reward function using poses and hints extracted from videos in the deep reinforcement learning controller and learned a policy to simultaneously reconstruct motion and control a virtual character. Second, we simulated figure skating movements in video. Skating sequences consist of fast and dynamic movements on ice, hindering the acquisition of motion data. Thus, we extracted 3D key poses from a video to then successfully replicate several figure skating movements using trajectory optimization and a deep reinforcement learning controller. Third, we devised an algorithm for gait analysis through video of patients with movement disorders. After acquiring the patients' joint positions from 2D video processed by a deep learning network, the 3D absolute coordinates were estimated, and gait parameters such as gait velocity, cadence, and step length were calculated. Additionally, we analyzed the optimization criteria of human walking by using a 3D musculoskeletal humanoid model and physics-based simulation. For two criteria, namely, the minimization of muscle activation and joint torque, we compared simulation data with real human data for analysis. To demonstrate the effectiveness of the first two research topics, we verified the reconstruction of dynamic human motions from 2D videos using physics-based simulations. For the last two research topics, we evaluated our results with real human data.

Korean abstract (translated): In computer graphics, the simulation and analysis of human movement has been an interesting research topic since the 1960s. Despite decades of active research, simulating realistic human movement in a 3D virtual space remains a difficult and challenging topic. Motion capture techniques have been used to obtain human motion data. Motion capture guarantees realistic results and high-quality data, but it requires a lot of equipment and the process is complicated. Recently, research on estimating 3D human pose from 2D video has shown remarkable results. Building on this, researchers in computer graphics and computer vision are attempting to reconstruct various human motions from video data. However, existing methods cannot stably estimate fast, dynamic motions and do not work on video shot with a moving camera. This thesis proposes methods to reconstruct dynamic human motions from video and to control those motions. First, we propose a framework that reconstructs motion from video using prior physics knowledge. For dynamic motions such as somersaults, the poses estimated with state-of-the-art methods are incomplete: the character's trajectory is unreliable or pose estimation fails partway through. We designed a reward function that uses poses and hints extracted from the video in a deep reinforcement learning controller, and learned a policy that performs motion reconstruction and character control simultaneously. Second, we simulate figure skating skills from video. Figure skating skills consist of fast, dynamic movements on ice, which makes motion data difficult to obtain. We extract 3D key poses from video and successfully demonstrate several figure skating skills using trajectory optimization and a deep reinforcement learning controller. Third, we propose an algorithm for analyzing the gait of patients with movement disorders caused by diseases such as Parkinson's disease or cerebral palsy. After obtaining the patient's joint positions from 2D video with deep-learning-based pose estimation, we recover 3D absolute coordinates and compute gait parameters such as step length and gait velocity from them. Finally, we explore the optimization criteria of human walking using a musculoskeletal human model and physics simulation. We simulate two criteria, minimization of muscle activation and minimization of joint torque, and analyze the results by comparison with real human data. To demonstrate the effectiveness of the first two research topics, we reproduce various dynamic human motions reconstructed from 2D video using physics simulation. The latter two research topics are evaluated through comparative analysis with human data.

Contents:
1 Introduction
2 Background
  2.1 Pose Estimation from 2D Video
  2.2 Motion Reconstruction from Monocular Video
  2.3 Physics-Based Character Simulation and Control
  2.4 Motion Reconstruction Leveraging Physics
  2.5 Human Motion Control
    2.5.1 Figure Skating Simulation
  2.6 Objective Gait Analysis
  2.7 Optimization for Human Movement Simulation
    2.7.1 Stability Criteria
3 Human Dynamics from Monocular Video with Dynamic Camera Movements
  3.1 Introduction
  3.2 Overview
  3.3 Pose and Contact Estimation
  3.4 Learning Human Dynamics
    3.4.1 Policy Learning
    3.4.2 Network Training
    3.4.3 Scene Estimator
  3.5 Results
    3.5.1 Video Clips
    3.5.2 Comparison of Contact Estimators
    3.5.3 Ablation Study
    3.5.4 Robustness
  3.6 Discussion
4 Figure Skating Simulation from Video
  4.1 Introduction
  4.2 System Overview
  4.3 Skating Simulation
    4.3.1 Non-holonomic Constraints
    4.3.2 Relaxation of Non-holonomic Constraints
  4.4 Data Acquisition
  4.5 Trajectory Optimization and Control
    4.5.1 Trajectory Optimization
    4.5.2 Control
  4.6 Experimental Results
  4.7 Discussion
5 Gait Analysis Using Pose Estimation Algorithm with 2D-video of Patients
  5.1 Introduction
  5.2 Method
    5.2.1 Patients and video recording
    5.2.2 Standard protocol approvals, registrations, and patient consents
    5.2.3 3D Pose estimation from 2D video
    5.2.4 Gait parameter estimation
    5.2.5 Statistical analysis
  5.3 Results
    5.3.1 Validation of video-based analysis of the gait
    5.3.2 Gait analysis
  5.4 Discussion
    5.4.1 Validation with the conventional sensor-based method
    5.4.2 Analysis of gait and turning in TUG
    5.4.3 Correlation with clinical parameters
    5.4.4 Limitations
  5.5 Supplementary Material
6 Control Optimization of Human Walking
  6.1 Overview
  6.2 Methods
    6.2.1 Musculoskeletal model
    6.2.2 Optimization
    6.2.3 Control co-activation level
    6.2.4 Push-recovery experiment
  6.3 Results
  6.4 Discussion
7 Conclusion
  7.1 Future work
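The gait parameters named in the thesis abstract (gait velocity, cadence, step length) can be sketched as follows once 3D coordinates are available. This is an illustrative sketch, not the thesis's method: it assumes heel-strike events have already been detected and are given as time-ordered `(time_s, (x, y, z))` tuples alternating between feet, and the function name `gait_parameters` is hypothetical.

```python
import math


def gait_parameters(heel_strikes):
    """Compute mean step length, cadence and velocity from heel strikes.

    heel_strikes: time-ordered list of (time_s, (x, y, z)) tuples in metres,
    alternating between left and right foot (an assumption of this sketch).
    """
    if len(heel_strikes) < 2:
        raise ValueError("need at least two heel strikes")

    # One step = distance between consecutive heel strikes of opposite feet.
    step_lengths = [
        math.dist(p0, p1)
        for (_, p0), (_, p1) in zip(heel_strikes, heel_strikes[1:])
    ]
    duration = heel_strikes[-1][0] - heel_strikes[0][0]
    if duration <= 0:
        raise ValueError("heel strikes must span a positive duration")

    n_steps = len(step_lengths)
    return {
        "step_length_m": sum(step_lengths) / n_steps,
        "cadence_steps_per_min": 60.0 * n_steps / duration,
        "velocity_m_per_s": sum(step_lengths) / duration,
    }


# Hypothetical data: two 0.6 m steps taken in one second.
strikes = [(0.0, (0.0, 0.0, 0.0)), (0.5, (0.6, 0.0, 0.0)), (1.0, (1.2, 0.0, 0.0))]
params = gait_parameters(strikes)
```

Summing step lengths gives path length rather than net displacement, so for a task that includes turning, velocity computed this way reflects distance walked per unit time.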

SNU Open Repository and Archive

Robust 3D Action Recognition through Sampling Local Appearances and Global Distributions

Author: Chen Chen
Liu Hong
Liu Mengyuan
Publication date: 07/12/2017

3D action recognition has broad applications in human-computer interaction and intelligent surveillance. However, recognizing similar actions remains challenging since previous literature fails to capture motion and shape cues effectively from noisy depth data. In this paper, we propose a novel two-layer Bag-of-Visual-Words (BoVW) model, which suppresses the noise disturbances and jointly encodes both motion and shape cues. First, background clutter is removed by a background modeling method that is designed for depth data. Then, motion and shape cues are jointly used to generate robust and distinctive spatial-temporal interest points (STIPs): motion-based STIPs and shape-based STIPs. In the first layer of our model, a multi-scale 3D local steering kernel (M3DLSK) descriptor is proposed to describe local appearances of cuboids around motion-based STIPs. In the second layer, a spatial-temporal vector (STV) descriptor is proposed to describe the spatial-temporal distributions of shape-based STIPs. Using the Bag-of-Visual-Words (BoVW) model, motion and shape cues are combined to form a fused action representation. Our model performs favorably compared with common STIP detection and description methods. Thorough experiments verify that our model is effective in distinguishing similar actions and robust to background clutter, partial occlusions and pepper noise

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Real-time Estimation of Physical Activity Intensity for Daily Living

Author: Aldamen Dima
Burghardt Tilo
Camplani Massimo
Cooper Ashley
Craddock Ian
Hannuna Sion
Mirmehdi Majid
Paiement Adeline
Tao Lili
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2016

Crossref

Explore Bristol Research

Study Of Human Activity In Video Data With An Emphasis On View-invariance

Author: Ashraf Nazim
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2012

The perception and understanding of human motion and action is an important area of research in computer vision that plays a crucial role in various applications such as surveillance, HCI, ergonomics, etc. In this thesis, we focus on the recognition of actions in the case of varying viewpoints and different and unknown camera intrinsic parameters. The challenges to be addressed include perspective distortions, differences in viewpoints, anthropometric variations, and the large degrees of freedom of articulated bodies. In addition, we are interested in methods that require little or no training. The current solutions to action recognition usually assume that there is a huge dataset of actions available so that a classifier can be trained. However, this means that in order to define a new action, the user has to record a number of videos from different viewpoints with varying camera intrinsic parameters and then retrain the classifier, which is not very practical from a development point of view. We propose algorithms that overcome these challenges and require just a few instances of the action from any viewpoint with any intrinsic camera parameters. Our first algorithm is based on the rank constraint on the family of planar homographies associated with triplets of body points. We represent action as a sequence of poses, and decompose the pose into triplets. Therefore, the pose transition is broken down into a set of movements of body point planes. In this way, we transform the non-rigid motion of the body points into a rigid motion of body point planes. We use the fact that the family of homographies associated with two identical poses would have rank 4 to gauge similarity of the pose between two subjects, observed by different perspective cameras and from different viewpoints. This method requires only one instance of the action. We then show that it is possible to extend the concept of triplets to line segments. In particular, we establish that if we look at the movement of line segments instead of triplets, we have more redundancy in data, thus leading to better results. We demonstrate this concept on “fundamental ratios.” We decompose a human body pose into line segments instead of triplets and look at the set of movements of line segments. This method needs only three instances of the action. If a larger dataset is available, we can also apply weighting on line segments for better accuracy. The last method is based on the concept of “projective depth.” Given a plane, we can find the depth of a point relative to that plane. We propose three different ways of using “projective depth”: (i) Triplets - the three points of a triplet along with the epipole define the plane, and the movement of points relative to these body planes can be used to recognize actions; (ii) Ground plane - if we are able to extract the ground plane, we can find the “projective depth” of the body points with respect to it, so the problem of action recognition translates to curve matching; and (iii) Mirror person - we can use the mirror view of the person to extract mirror symmetric planes. This method also needs only one instance of the action. Extensive experiments are reported on testing view invariance, robustness to noisy localization and occlusions of body points, and action recognition. The experimental results are very promising and demonstrate the efficiency of our proposed invariants.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

View-invariant human movement assessment

Author: Sardari Faegheh
Publication date: 21/06/2022

Explore Bristol Research

Using the Microsoft Kinect to assess human bimanual coordination

Author: Liddy Joshua James
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2014

Optical marker-based systems are the gold standard for capturing three-dimensional (3D) human kinematics. However, these systems have various drawbacks, including time-consuming marker placement, soft tissue movement artifact, prohibitive cost, and lack of portability. The Microsoft Kinect is an inexpensive, portable depth camera that can be used to capture 3D human movement kinematics. Numerous investigations have assessed the Kinect's ability to capture postural control and gait, but to date, no study has evaluated its capabilities for measuring spatiotemporal coordination. In order to investigate human coordination and coordination stability with the Kinect, a well-studied bimanual coordination paradigm (Kelso, 1984; Kelso, Scholz, & Schöner, 1986) was adapted.

Nineteen participants performed ten trials of coordinated hand movements in either in-phase or anti-phase patterns of coordination to the beat of a metronome which was incrementally sped up and slowed down. Continuous relative phase (CRP) and the standard deviation of CRP were used to assess coordination and coordination stability, respectively.

Data from the Kinect were compared to a Vicon motion capture system using a mixed-model, repeated measures analysis of variance and intraclass correlation coefficients (2,1) (ICC(2,1)).

Kinect significantly underestimated CRP for the anti-phase coordination pattern (p < .0001) and overestimated the in-phase pattern (p < .0001). However, a high ICC value (r=.097) was found between the systems. For the standard deviation of CRP, the Kinect exhibited significantly higher variability than the Vicon (p < .0001) but was able to distinguish significant differences between patterns of coordination, with anti-phase variability being higher than in-phase (p < .0001). Additionally, the Kinect was unable to accurately capture the structure of coordination stability for the anti-phase pattern. Finally, agreement was found between systems using the ICC (r=.37).

In conclusion, the Kinect was unable to accurately capture mean CRP. However, the high ICC between the two systems is promising, and the Kinect was able to distinguish between the coordination stability of in-phase and anti-phase coordination. However, the structure of variability as movement speed increased was dissimilar to the Vicon, particularly for the anti-phase pattern. Some aspects of coordination are captured well by the Kinect while others are not. Detecting differences between bimanual coordination patterns and the stability of those patterns can be achieved using the Kinect. However, researchers interested in the structure of coordination stability should exercise caution since poor agreement was found between systems.
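Continuous relative phase, the coordination measure used in the abstract above, can be sketched as the difference between the phase angles of the two hands in a normalized position-velocity plane. The abstract does not state how phase was computed, so the centering, the central-difference velocity, the normalization by maximum absolute value, and the function names below are all assumptions of this sketch.

```python
import math
import statistics


def phase_angles(positions, dt):
    """Phase angle (radians) per sample from normalized position and velocity."""
    mean = sum(positions) / len(positions)
    centered = [p - mean for p in positions]
    # Central-difference velocity; the first and last samples are dropped.
    vel = [(centered[i + 1] - centered[i - 1]) / (2 * dt)
           for i in range(1, len(centered) - 1)]
    pos = centered[1:-1]
    p_max = max(abs(p) for p in pos)
    v_max = max(abs(v) for v in vel)
    if p_max == 0 or v_max == 0:
        raise ValueError("signal must oscillate")
    return [math.atan2(v / v_max, p / p_max) for p, v in zip(pos, vel)]


def continuous_relative_phase(left, right, dt):
    """Absolute relative phase in degrees, folded into [0, 180]."""
    crp = []
    for a, b in zip(phase_angles(left, dt), phase_angles(right, dt)):
        diff = math.degrees(a - b)
        crp.append(abs((diff + 180.0) % 360.0 - 180.0))
    return crp


# Synthetic 1 Hz hand trajectories sampled at 100 Hz for two seconds.
dt = 0.01
left = [math.sin(2 * math.pi * i * dt) for i in range(200)]
in_phase = continuous_relative_phase(left, left, dt)
anti_phase = continuous_relative_phase(left, [-x for x in left], dt)
stability = statistics.pstdev(anti_phase)  # SD of CRP as a stability index
```

With this convention, values near 0 degrees indicate in-phase movement and values near 180 degrees indicate anti-phase movement. Folding into [0, 180] is a simplification; circular statistics would be a more careful choice for the standard deviation near the boundaries.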

Purdue E-Pubs