Boosted Multiple Kernel Learning for First-Person Activity Recognition
Activity recognition from first-person (ego-centric) videos has recently
gained attention due to the increasing ubiquity of wearable cameras. There
has been a surge of efforts adapting existing feature descriptors and designing
new descriptors for first-person videos. An effective activity recognition
system requires selection and use of complementary features and appropriate
kernels for each feature. In this study, we propose a data-driven framework for
first-person activity recognition which effectively selects and combines
features and their respective kernels during training. Our experimental
results show that using Multiple Kernel Learning (MKL) and Boosted MKL for
first-person activity recognition improves on the state of the art.
In addition, these techniques enable the
expansion of the framework with new features in an efficient and convenient
way.
Comment: First published in the Proceedings of the 25th European Signal
Processing Conference (EUSIPCO-2017) in 2017, published by EURASI
Recognising Complex Activities with Histograms of Relative Tracklets
One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.
RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, and they are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.