268 research outputs found

Deep Learning-Based Human Pose Estimation: A Survey

Author: Chen Chen
Kehtarnavaz Nasser
Liu Ruixu
Shah Mubarak
Shen Ju
Wu Wenhan
Yang Taojiannan
Zheng Ce
Zhu Sijie
Publication date: 02/01/2021

Human pose estimation aims to locate the human body parts and build human body representation (e.g., body skeleton) from input data such as images and videos. It has drawn increasing attention during the past decade and has been utilized in a wide range of applications including human-computer interaction, motion analysis, augmented reality, and virtual reality. Although the recently developed deep learning-based solutions have achieved high performance in human pose estimation, there still remain challenges due to insufficient training data, depth ambiguities, and occlusion. The goal of this survey paper is to provide a comprehensive review of recent deep learning-based solutions for both 2D and 3D pose estimation via a systematic analysis and comparison of these solutions based on their input data and inference procedures. More than 240 research papers since 2014 are covered in this survey. Furthermore, 2D and 3D human pose estimation datasets and evaluation metrics are included. Quantitative performance comparisons of the reviewed methods on popular datasets are summarized and discussed. Finally, the challenges involved, applications, and future research directions are concluded. We also provide a regularly updated project page: https://github.com/zczcwh/DL-HPE

arXiv.org e-Print Archive

Crowd detection and counting using a static and dynamic platform: state of the art

Author: Chaudhry Huma
Rahim MSM
Rehman A
Saba T
Publication venue: Inderscience Publishers
Publication date: 01/01/2019

Automated object detection and crowd density estimation are popular and important areas in visual surveillance research. The last decades have witnessed much significant research in this field; however, it remains a challenging problem for automatic visual surveillance. The ever-increasing research in the field of crowd dynamics and crowd motion necessitates a detailed and updated survey of different techniques and trends in this field. This paper presents a survey on crowd detection and crowd density estimation from a moving platform and surveys the different methods employed for this purpose. This review categorizes and delineates several detection and counting estimation methods that have been applied for the examination of scenes from static and moving platforms.

Crossref

Victoria University Eprints Repository

Universiti Teknologi Malaysia Institutional Repository

Subgraphs Matching-Based Side Information Generation for Distributed Multiview Video Coding

Author: Hongkai Xiong
Hui Lv
Li Song
Tsuhan Chen
Yongsheng Zhang
Zhihai He
Publication venue: Springer Nature
Publication date: 01/01/2010

Springer - Publisher Connector

Real-Time Human Pose Estimation on a Smart Walker using Convolutional Neural Networks

Author: Frontoni Emanuele
Migliorelli Lucia
Moccia Sara
Palermo Manuel
Santos Cristina P.
Publication venue: Elsevier BV
Publication date: 01/01/2021

Rehabilitation is important to improve quality of life for mobility-impaired patients. Smart walkers are a commonly used solution that should embed automatic and objective tools for data-driven human-in-the-loop control and monitoring. However, present solutions focus on extracting a few specific metrics from dedicated sensors with no unified full-body approach. We investigate a general, real-time, full-body pose estimation framework based on two RGB+D camera streams with non-overlapping views mounted on a smart walker used in rehabilitation. Human keypoint estimation is performed using a two-stage neural network framework. The 2D-Stage implements a detection module that locates body keypoints in the 2D image frames. The 3D-Stage implements a regression module that lifts and relates the detected keypoints in both cameras to the 3D space relative to the walker. Model predictions are low-pass filtered to improve temporal consistency. A custom acquisition method was used to obtain a dataset, with 14 healthy subjects, used for training and evaluating the proposed framework offline, which was then deployed on the real walker equipment. Overall keypoint detection errors of 3.73 pixels for the 2D-Stage and 44.05 mm for the 3D-Stage were reported, with an inference time of 26.6 ms when deployed on the constrained hardware of the walker. We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers. It is able to extract a complete and compact body representation in real time and from inexpensive sensors, serving as a common base for downstream metrics extraction solutions and Human-Robot interaction applications. Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios. Comment: Accepted for publication in Expert Systems with Application
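The abstract above mentions low-pass filtering of model predictions and reports a keypoint error in millimetres, without saying which filter or exact error definition was used. As a rough illustration only, the sketch below assumes a first-order exponential moving average as the low-pass filter and a mean Euclidean distance as the error measure. The function names and the `alpha` smoothing factor are hypothetical and are not taken from the paper.

```python
from math import dist  # available in Python 3.8+


def ema_filter(frames, alpha=0.5):
    """Smooth a keypoint sequence with an exponential moving average.

    Assumption: a first-order low-pass filter stands in for whatever
    filter the authors actually used.
    frames: list of frames, each a list of (x, y, z) keypoints.
    alpha:  hypothetical smoothing factor in (0, 1]; 1.0 means no smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for frame in frames:
        if not smoothed:
            # The first frame has no history, so it passes through unchanged.
            smoothed.append([tuple(float(c) for c in p) for p in frame])
            continue
        prev = smoothed[-1]
        smoothed.append([
            tuple(alpha * c + (1.0 - alpha) * pc for c, pc in zip(p, pp))
            for p, pp in zip(frame, prev)
        ])
    return smoothed


def mean_keypoint_error(pred, truth):
    """Mean Euclidean distance between predicted and reference keypoints."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("need two non-empty keypoint lists of equal length")
    return sum(dist(p, t) for p, t in zip(pred, truth)) / len(pred)
```

A smaller `alpha` suppresses more frame-to-frame jitter but makes the estimate lag behind fast movements, so the value would need tuning against the frame rate.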

arXiv.org e-Print Archive

Archivio della ricerca della Scuola Superiore Sant'Anna

Cable Tension Monitoring using Non-Contact Vision-based Techniques

Author: Chu Chaoyang
Publication venue: University of Windsor Leddy Library
Publication date: 07/07/2020

In cable-stayed bridges, the structural systems of tensioned cables play a critical role in structural and functional integrity. Tensile forces in the cables are therefore one of the essential indicators in structural health monitoring (SHM). In this thesis, a video image processing technology integrated with cable dynamic analysis is proposed as a non-contact vision-based measurement technique, which provides a user-friendly, cost-effective, and computationally efficient solution to displacement extraction, frequency identification, and cable tension monitoring. In contrast to conventional contact sensors, the vision-based system is capable of taking remote measurements of cable dynamic response while having flexible sensing capability. Since cable detection is a substantial step in displacement extraction, a comprehensive study on the feasibility of the adopted feature detector is conducted under various testing scenarios. The performance of the feature detector is quantified by developing evaluation parameters. Enhancement methods for the feature detector in cable detection are also investigated under complex testing environments. Threshold-dependent image matching approaches, which optimize the functionality of the feature-based video image processing technology, are proposed for noise-free and noisy background scenarios. The vision-based system is validated through experimental studies of free vibration tests on a single undamped cable in laboratory settings. The maximum percentage difference of the identified cable fundamental frequency is found to be 0.74% compared with accelerometer readings, while the maximum percentage difference of the estimated cable tensile force is 4.64% compared to direct measurement by a load cell.
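The abstract above derives cable tension from an identified vibration frequency but does not state the dynamic model used. As a hedged illustration, the sketch below assumes the classical taut-string relation T = 4 m L^2 (f_n / n)^2, which ignores sag and bending stiffness and may differ from the thesis's formulation. It also shows a percentage-difference helper like the one implied by the reported comparisons. The function and parameter names are hypothetical.

```python
def taut_string_tension(freq_hz, length_m, mass_per_length_kg_m, mode=1):
    """Estimate cable tension in newtons from a natural frequency.

    Assumption: ideal taut-string theory, T = 4 * m * L**2 * (f_n / n)**2,
    with no sag or bending-stiffness correction.
    """
    if freq_hz <= 0 or length_m <= 0 or mass_per_length_kg_m <= 0:
        raise ValueError("frequency, length and mass must be positive")
    if mode < 1:
        raise ValueError("mode number must be at least 1")
    return 4.0 * mass_per_length_kg_m * length_m ** 2 * (freq_hz / mode) ** 2


def percent_difference(estimate, reference):
    """Absolute difference of an estimate from a reference, in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(estimate - reference) / abs(reference) * 100.0
```

Because tension scales with the square of frequency in this model, a small relative error in the identified frequency roughly doubles in the tension estimate.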

Scholarship at UWindsor

Large-scale interactive exploratory visual search

Author: Lu Shiyang
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2014

Large-scale visual search has been one of the challenging issues in the era of big data. It demands techniques that are not only highly effective and efficient but also allow users to conveniently express their information needs and refine their intents. In this thesis, we focus on developing an exploratory framework for large-scale visual search. We also develop a number of enabling techniques, including compact visual content representation for scalable search, near-duplicate video shot detection, and action-based event detection. We propose a novel scheme for extremely low bit rate visual search, which sends compressed visual words consisting of vocabulary tree histogram and descriptor orientations rather than descriptors. Compact representation of video data is achieved through identifying keyframes of a video, which can also help users comprehend visual content efficiently. We propose a novel Bag-of-Importance model for static video summarization. Near-duplicate detection is one of the key issues for large-scale visual search, since there exist a large number of nearly identical images and videos. We propose an improved near-duplicate video shot detection approach for more effective shot representation. Event detection has been one of the solutions for bridging the semantic gap in visual search. We particularly focus on human action centred event detection. We propose an enhanced sparse coding scheme to model human actions. Our proposed approach is able to significantly reduce computational cost while achieving recognition accuracy highly comparable to the state-of-the-art methods. Finally, we propose an integrated solution for addressing the prime challenges raised by large-scale interactive visual search. The proposed system is also one of the first attempts at exploratory visual search. It provides users with more robust results to satisfy their exploring experiences.

Sydney eScholarship