RGB-D-based Action Recognition Datasets: A Survey
Human action recognition from RGB-D (Red, Green, Blue and Depth) data has
attracted increasing attention since the first work reported in 2010. Over this
period, many benchmark datasets have been created to facilitate the development
and evaluation of new algorithms. This raises the question of which dataset to
select and how to use it in providing a fair and objective comparative
evaluation against state-of-the-art methods. To address this issue, this paper
provides a comprehensive review of the most commonly used action recognition
related RGB-D video datasets, including 27 single-view datasets, 10 multi-view
datasets, and 7 multi-person datasets. The detailed information and analysis of
these datasets are a useful resource for guiding the insightful selection of datasets
for future research. In addition, the issues with current algorithm evaluation
vis-à-vis the limitations of the available datasets and evaluation protocols
are also highlighted, resulting in a number of recommendations for the collection
of new datasets and the use of evaluation protocols.
A preliminary study of micro-gestures: dataset collection and analysis with multi-modal dynamic networks
Micro-gestures (MGs) are gestures that people perform spontaneously during communication. This thesis presents a preliminary exploration of micro-gestures. By recording sequences of body gestures produced in a spontaneous state during games, an MG dataset was built using a Kinect V2 sensor, and the novel term 'micro-gesture' is proposed based on an analysis of the dataset's properties. Two neural network architectures are implemented for the micro-gesture segmentation and recognition tasks: a DBN-HMM model for skeleton data and a 3DCNN-HMM model for RGB-D data. We also explore a method for extracting the neutral states used in the HMM structure by detecting the activity level of the gesture sequences; the method is simple to derive and implement and proves effective. The DBN-HMM and 3DCNN-HMM architectures are evaluated on the MG dataset and optimised for the properties of micro-gestures. Experimental results show that both models achieve micro-gesture segmentation and recognition with satisfactory accuracy. This work also opens a new research path for gesture recognition, and we believe it can serve as a baseline for future research on micro-gestures.
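The abstract mentions extracting neutral states by detecting the activity level of the gesture sequences, without specifying the detector. A minimal sketch of one plausible reading, assuming skeleton data as (frames x joints x 3) arrays; both the smoothing `window` and the `threshold` below are illustrative values, not the thesis's configuration:

```python
import numpy as np

def activity_level(joints, window=5):
    """Per-frame activity level: mean joint displacement magnitude,
    smoothed with a moving average over `window` frames."""
    # joints: (T, J, 3) array of T frames, J skeleton joints, 3D coords
    disp = np.linalg.norm(np.diff(joints, axis=0), axis=2).mean(axis=1)  # (T-1,)
    kernel = np.ones(window) / window
    return np.convolve(disp, kernel, mode="same")

def neutral_frames(joints, threshold=0.01):
    """Frames whose activity level falls below `threshold` are
    treated as the neutral (resting) state for the HMM."""
    level = activity_level(joints)
    return np.flatnonzero(level < threshold)
```

Frames flagged as neutral could then anchor the HMM's resting state between gesture segments.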
Articulated motion and deformable objects
This guest editorial introduces the twenty-two papers accepted for this Special Issue on Articulated Motion and Deformable Objects (AMDO). They are grouped into four main categories within the field of AMDO: human motion analysis (action/gesture), human pose estimation, deformable shape segmentation, and face analysis. For each of the four topics, a survey of recent developments in the field is presented, and the accepted papers are briefly introduced in the context of this survey. They contribute novel methods, algorithms with improved performance as measured on benchmark datasets, as well as two new datasets for hand action detection and human posture analysis. The special issue should be of high relevance to readers interested in AMDO recognition and should promote future research directions in the field.
Exploring Engineering Applications of Visual Analytics in Virtual Reality
Recent advancements and technological breakthroughs in the development of so-called immersive interfaces, such as augmented (AR), mixed (MR), and virtual reality (VR), coupled with the growing mass-market adoption of such devices, have started to attract attention from academia and industry alike. Of these technologies, VR offers the most mature option in terms of both hardware and software, as well as the widest range of off-the-shelf offerings. VR is a term used interchangeably to denote both head-mounted displays (HMDs) and the fully immersive, bespoke 3D environments to which these devices transport their users. With modern devices, developers can leverage a range of interaction modalities, including visual, audio, and even haptic feedback, in the creation of these virtual worlds. With such a rich interaction space, it is natural to think of VR as a well-suited environment for interactive visualisation and analytical reasoning over complex multidimensional data.
Research in visual analytics (VA) combines these two themes. Spanning the last decade and a half, it has produced a number of findings, including a range of new, advanced, and effective visualisation and analysis tools for ever more complex, noisier, and larger data sets. Furthermore, the extension of this research with immersive interfaces to facilitate visual analytics has spun off a new field of research: immersive analytics (IA). Immersive analytics leverages the potential of immersive interfaces to aid the user in swift and effective data analysis.
Some of the most promising industrial application domains of such immersive interfaces are various branches of engineering, including aerospace design and civil engineering. The range of potential applications is vast and growing as new stakeholders adopt these immersive tools. However, the use of these technologies brings its own challenges. One such difficulty is the design of appropriate interaction techniques: there is no single optimal choice; instead, the choice varies depending on the available hardware, the user's prior experience, the task at hand, and the nature of the dataset.
To this end, my PhD work has focused on designing and analysing various interactive, VR-based immersive systems for engineering visual analytics. One of the key elements of such an immersive system is the selection of an adequate interaction method. In a series of both qualitative and quantitative studies, I have explored the potential of various interaction techniques that can be used to support the user in swift and effective data analysis.
Here, I have investigated the feasibility of using techniques such as hand-held controllers, gaze-tracking, and hand-tracking input methods, used solo or in combination, in various challenging use cases and scenarios. For instance, I developed and verified the usability and effectiveness of the AeroVR system for aerospace design in VR. This research has allowed me to trim the very large design space of such systems, which had not been sufficiently explored thus far. Moreover, building on top of this work, I designed, developed, and tested a system for digital twin assessment in aerospace that coupled gaze-tracking with hand-tracking, achieved via an additional sensor attached to the front of the VR headset, with no need for the user to hold a controller. The analysis of the results obtained from a qualitative study with domain experts allowed me to distill and propose design implications for developing similar systems. Furthermore, I worked towards designing an effective VR-based visualisation of complex, multidimensional abstract datasets. Here, I developed and evaluated an immersive version of the well-known Parallel Coordinates Plots (IPCP) visualisation technique. The results of a series of qualitative user studies allowed me to obtain a list of design suggestions for IPCP, as well as tentative evidence that IPCP can be an effective tool for multidimensional data analysis. Lastly, I also worked on the design, development, and verification of a system allowing its users to capture information in the context of conducting engineering surveys in VR.
Furthermore, conducting a meaningful evaluation of immersive analytics interfaces remains an open problem. It is difficult, and often not feasible, to use traditional A/B comparisons in controlled experiments, as the aim of immersive analytics is to provide its users with new insights into their data rather than to optimise more readily quantifiable factors. To this end, I developed a generative process for synthesising clustered datasets for VR analytics experiments that can be used in interface evaluation. I further validated this approach by designing and carrying out two user studies. The statistical analysis of the gathered data revealed that this generative process did indeed produce datasets that can be used in experiments without the datasets themselves being the dominant contributor to the variability between conditions.
Engineering and Physical Sciences Research Council (EPSRC-1788814); Trinity Hall and Cambridge Commonwealth, European & International Trust; Cambridge Philosophical Society
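The generative process itself is not specified in the abstract. As a hedged illustration of the general idea (producing many clustered point sets that differ only by random seed, so no single dataset dominates between-condition variability), one could use scikit-learn's `make_blobs`; the point counts, cluster count, and spread below are arbitrary stand-ins:

```python
from sklearn.datasets import make_blobs

def synthesise_clustered_dataset(n_points=300, n_clusters=4,
                                 spread=0.8, seed=0):
    """Generate one synthetic 3D clustered dataset for a VR
    analytics study condition (illustrative parameters only)."""
    points, labels = make_blobs(n_samples=n_points, centers=n_clusters,
                                cluster_std=spread, n_features=3,
                                random_state=seed)
    return points, labels

# One dataset per experimental condition, differing only by seed.
datasets = [synthesise_clustered_dataset(seed=s) for s in range(3)]
```

Holding the generative parameters fixed across seeds is what keeps the datasets statistically comparable between conditions.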
Spatiotemporal analysis of human actions using RGB-D cameras
Markerless human motion analysis has strong potential to provide cost-efficient solutions for action recognition and body pose estimation. Many applications, including human-computer interaction, video surveillance, content-based video indexing, and automatic annotation, among others, will benefit from a robust solution to these problems. Depth sensing technologies in recent years have positively changed the climate of automated vision-based human action recognition, a problem deemed very difficult due to the various ambiguities inherent in conventional video. In this work, first a large set of invariant spatiotemporal features is extracted from skeleton joints (retrieved from a depth sensor) in motion and evaluated as a baseline. Next, we introduce a discriminative Random Decision Forest-based feature selection framework capable of reaching impressive action recognition performance when combined with a linear SVM classifier. This approach improves upon the baseline obtained with the whole feature set while using significantly fewer features (one tenth of the original). The approach can also be used to provide insights into the spatiotemporal dynamics of human actions. A novel therapeutic action recognition dataset (WorkoutSU-10) is presented; we used this dataset as a benchmark to evaluate the reliability of the proposed methods, and it has recently been published publicly as a contribution to the action recognition community. In addition, an interactive action evaluation application is developed using the proposed methods to help with real-life problems such as fall detection in elderly people or automated therapy programs for patients with motor disabilities.
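The pipeline described above (forest-based feature ranking followed by a linear SVM on the top tenth of the features) can be sketched with scikit-learn. The synthetic data, forest size, and SVM settings here are illustrative stand-ins, not the paper's actual features or configuration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Stand-in for spatiotemporal skeleton features: 500 samples, 200 features.
X, y = make_classification(n_samples=500, n_features=200,
                           n_informative=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Rank features by Random Decision Forest importance.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
keep = np.argsort(forest.feature_importances_)[::-1][:X.shape[1] // 10]

# Train a linear SVM on the selected tenth of the features only.
svm = LinearSVC(dual=False).fit(X_tr[:, keep], y_tr)
print(f"accuracy with {len(keep)} of {X.shape[1]} features:",
      svm.score(X_te[:, keep], y_te))
```

The same importance ranking also indicates which spatiotemporal features (and hence which joints and time spans) drive discrimination, matching the abstract's point about insight into action dynamics.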
Fingers micro-gesture recognition based on holoscopic 3D imaging system
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Micro-gesture recognition has been widely researched in recent years; in particular, there has been a great focus on 3D micro-gesture recognition, which consists of classifying the micro-gesture movements of the fingers for touch-less control applications. The holoscopic 3D imaging system mimics the fly's-eye technique to capture a true 3D scene that is rich in both texture and motion information. As a result, the holoscopic 3D imaging system is a suitable approach for robust recognition applications. This PhD research focuses on innovative 3D micro-gesture recognition based on the holoscopic 3D system, delivering robust, reliable, and precise performance for 3D micro-gestures. This can also be applied to a wide range of other applications such as the Internet of Things (IoT), AR/VR, robotics, and other touch-less interaction.
Due to the lack of a holoscopic 3D dataset, a comprehensive 3D micro-gesture dataset (HoMG), including both holoscopic 3D images and videos, was prepared. It is a reasonably sized holoscopic 3D dataset captured with different camera settings and conditions from 40 participants. As an initial investigation, 3D micro-gesture recognition based on 2D feature extraction methods with basic classifiers reaches around 50.9% accuracy on micro-gesture images, while for video-based data, 3D feature extraction methods achieve 66.7% accuracy. The HoMG database was used for a challenge at the IEEE International Conference on Automatic Face and Gesture Recognition 2018, where four groups from international research institutes joined the challenge and contributed many new methods as further development, and the proposed method was published.
Building on the holoscopic 3D dataset, an innovative 3D micro-gesture recognition system is proposed, and its performance is evaluated in a like-for-like comparison with state-of-the-art methods. In addition, a fast and efficient pre-processing algorithm for H3D images extracts the elemental images, and a simplified viewpoint image extraction method is presented. A pre-trained CNN model with an attention mechanism is applied to the viewpoint (VP) images to produce predicted gesture probabilities, and the approach is further improved using a voting strategy. The proposed approach achieves 87% accuracy, outperforming all existing state-of-the-art methods on the image-based database.
Advanced 3D micro-gesture recognition is then investigated on the sequential video database, using an end-to-end model for effective H3D-based micro-gesture recognition. For the front end, two methods, traditional viewpoint image extraction and a novel pseudo-viewpoint image extraction, are used and evaluated. The pseudo-viewpoint (PVP) front end helps deep learning networks understand the implicit 3D information of the H3D imaging system, while the VP front end follows the traditional H3D image method of extracting and reconstructing the multi-viewpoint images. Both front ends feed four popular deep networks for learning and classification. These experiments evaluate the performance of 2D convolutional, 3D convolutional, mixed 2D/3D convolutional, and LSTM networks on the HoMG video database, showing which deep learning networks benefit the H3D imaging system. Finally, to obtain higher accuracies, majority voting is applied for further improvement. The final results show that the performance is not only better than the traditional methods but also superior to the existing deep-learning-based approaches,
which clearly demonstrates the effectiveness of the proposed approach.
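The voting strategy used in both the image-based and video-based systems is described only as majority voting over per-viewpoint predictions. A minimal, hypothetical sketch of that final fusion step (the class count and viewpoint count below are invented for illustration):

```python
import numpy as np

def majority_vote(viewpoint_probs):
    """Fuse per-viewpoint class-probability vectors into one label
    by majority vote over the per-viewpoint argmax predictions."""
    votes = np.argmax(viewpoint_probs, axis=1)  # one predicted label per viewpoint
    return np.bincount(votes).argmax()          # most frequent label wins

# e.g. 3 viewpoints, 4 gesture classes
probs = np.array([[0.1, 0.7, 0.1, 0.1],
                  [0.2, 0.5, 0.2, 0.1],
                  [0.6, 0.2, 0.1, 0.1]])
print(majority_vote(probs))  # two of three viewpoints vote for class 1
```

Voting over viewpoints suppresses errors from any single noisy viewpoint image, which is a plausible reason the reported accuracy improves after this step.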