18,694 research outputs found
Action tube extraction based 3D-CNN for RGB-D action recognition
In this paper we propose a novel action tube extractor for RGB-D action recognition in trimmed videos. The action tube extractor takes as input a video and outputs an action tube. The method consists of two parts: spatial tube extraction and temporal sampling. The first part is built upon MobileNet-SSD and its role is to define the spatial region where the action takes place. The second part is based on the structural similarity index (SSIM) and is designed to remove frames without obvious motion from the primary action tube. The final extracted action tube has two benefits: 1) a higher ratio of ROI (subjects of action) to background; 2) most frames contain obvious motion change. We propose to use a two-stream (RGB and Depth) I3D architecture as our 3D-CNN model. Our approach outperforms the state-of-the-art methods on the OA and NTU RGB-D datasets. © 2018 IEEE.Peer ReviewedPostprint (published version
What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots
For robots that have the capability to interact with the physical environment
through their end effectors, understanding the surrounding scenes is not merely
a task of image classification or object recognition. To perform actual tasks,
it is critical for the robot to have a functional understanding of the visual
scene. Here, we address the problem of localizing and recognition of functional
areas from an arbitrary indoor scene, formulated as a two-stage deep learning
based detection pipeline. A new scene functionality testing-bed, which is
complied from two publicly available indoor scene datasets, is used for
evaluation. Our method is evaluated quantitatively on the new dataset,
demonstrating the ability to perform efficient recognition of functional areas
from arbitrary indoor scenes. We also demonstrate that our detection model can
be generalized onto novel indoor scenes by cross validating it with the images
from two different datasets
- …