Search CORE

26,664 research outputs found

Automatic Recognition of Concurrent and Coupled Human Motion Sequences

Author: Gehrig Dirk
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2015
Field of study

We developed methods and algorithms for all parts of a motion recognition system, i. e. Feature Extraction, Motion Segmentation and Labeling, Motion Primitive and Context Modeling as well as Decoding. We collected several datasets to compare our proposed methods with the state-of-the-art in human motion recognition. The main contributions of this thesis are a structured functional motion decomposition and a flexible and scalable motion recognition system suitable for a Humanoid Robot

KITopen

Learning spatio-temporal representations for action recognition: A genetic programming approach

Author: Li Xuelong
Liu Li
Lu Ke
Shao L
Shao Ling
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/02/2015
Field of study

Extracting discriminative and robust features from video sequences is the first and most critical step in human action recognition. In this paper, instead of using handcrafted features, we automatically learn spatio-temporal motion features for action recognition. This is achieved via an evolutionary method, i.e., genetic programming (GP), which evolves the motion feature descriptor on a population of primitive 3D operators (e.g., 3D-Gabor and wavelet). In this way, the scale and shift invariant features can be effectively extracted from both color and optical flow sequences. We intend to learn data adaptive descriptors for different datasets with multiple layers, which makes fully use of the knowledge to mimic the physical structure of the human visual cortex for action recognition and simultaneously reduce the GP searching space to effectively accelerate the convergence of optimal solutions. In our evolutionary architecture, the average cross-validation classification error, which is calculated by an support-vector-machine classifier on the training set, is adopted as the evaluation criterion for the GP fitness function. After the entire evolution procedure finishes, the best-so-far solution selected by GP is regarded as the (near-)optimal action descriptor obtained. The GP-evolving feature extraction method is evaluated on four popular action datasets, namely KTH, HMDB51, UCF YouTube, and Hollywood2. Experimental results show that our method significantly outperforms other types of features, either hand-designed or machine-learned

Northumbria Research Link

Crossref

Institutional Repository of Xi'an Institute of Optics and Precision Mechanics, CAS

University of East Anglia digital repository

Understanding of Object Manipulation Actions Using Human Multi-Modal Sensory Data

Author: Abbasi Bahareh
Noohi Ehsan
Parastegari Sina
Zefran Milos
Publication venue
Publication date: 01/01/2019
Field of study

Object manipulation actions represent an important share of the Activities of Daily Living (ADLs). In this work, we study how to enable service robots to use human multi-modal data to understand object manipulation actions, and how they can recognize such actions when humans perform them during human-robot collaboration tasks. The multi-modal data in this study consists of videos, hand motion data, applied forces as represented by the pressure patterns on the hand, and measurements of the bending of the fingers, collected as human subjects performed manipulation actions. We investigate two different approaches. In the first one, we show that multi-modal signal (motion, finger bending and hand pressure) generated by the action can be decomposed into a set of primitives that can be seen as its building blocks. These primitives are used to define 24 multi-modal primitive features. The primitive features can in turn be used as an abstract representation of the multi-modal signal and employed for action recognition. In the latter approach, the visual features are extracted from the data using a pre-trained image classification deep convolutional neural network. The visual features are subsequently used to train the classifier. We also investigate whether adding data from other modalities produces a statistically significant improvement in the classifier performance. We show that both approaches produce a comparable performance. This implies that image-based methods can successfully recognize human actions during human-robot collaboration. On the other hand, in order to provide training data for the robot so it can learn how to perform object manipulation actions, multi-modal data provides a better alternative

arXiv.org e-Print Archive

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Content-based Video Retrieval by Integrating Spatio-Temporal and Stochastic Recognition of Events

Author: Jonker W.
Petkovic M.
Publication venue: IEEE
Publication date: 01/01/2001
Field of study

As amounts of publicly available video data grow the need to query this data efficiently becomes significant. Consequently content-based retrieval of video data turns out to be a challenging and important problem. We address the specific aspect of inferring semantics automatically from raw video data. In particular, we introduce a new video data model that supports the integrated use of two different approaches for mapping low-level features to high-level concepts. Firstly, the model is extended with a rule-based approach that supports spatio-temporal formalization of high-level concepts, and then with a stochastic approach. Furthermore, results on real tennis video data are presented, demonstrating the validity of both approaches, as well us advantages of their integrated us

CiteSeerX

Pure OAI Repository

University of Twente Research Information

Going Deeper into Action Recognition: A Survey

Author: Harandi Mehrtash
Herath Samitha
Porikli Fatih
Publication venue
Publication date: 01/01/2017
Field of study

Understanding human actions in visual data is tied to advances in complementary research areas including object recognition, human dynamics, domain adaptation and semantic segmentation. Over the last decade, human action analysis evolved from earlier schemes that are often limited to controlled environments to nowadays advanced solutions that can learn from millions of videos and apply to almost all daily activities. Given the broad range of applications from video surveillance to human-computer interaction, scientific milestones in action recognition are achieved more rapidly, eventually leading to the demise of what used to be good in a short time. This motivated us to provide a comprehensive review of the notable steps taken towards recognizing human actions. To this end, we start our discussion with the pioneering methods that use handcrafted representations, and then, navigate into the realm of deep learning based approaches. We aim to remain objective throughout this survey, touching upon encouraging improvements as well as inevitable fallbacks, in the hope of raising fresh questions and motivating new research directions for the reader

arXiv.org e-Print Archive

The Australian National University