4 research outputs found

    A 0.6V, 8mW 3D Vision Processor for a Navigation Device for the Visually Impaired

    This paper presents an energy-efficient computer vision processor for a navigation device for the visually impaired. Utilizing a shared parallel datapath, out-of-order processing, and co-optimization with hardware-oriented algorithms, the processor consumes 8mW at 0.6V while processing a 30 fps input data stream in real time. The test chip, fabricated in 40nm, is demonstrated as the core part of a navigation device based on a ToF camera, which successfully detects safe areas and obstacles.
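
    To make the detection task concrete, the following minimal Python sketch segments a single ToF depth frame into obstacle and safe pixels by range thresholding. It only illustrates the task the chip performs, not the chip's hardware algorithm; the 1.5 m threshold and frame size are arbitrary assumptions.

```python
# Minimal illustration of ToF-based obstacle / safe-area segmentation.
# This is NOT the chip's algorithm; threshold and frame size are assumptions.
import numpy as np

def segment_depth_frame(depth_m: np.ndarray, obstacle_range_m: float = 1.5):
    """Return boolean masks of obstacle pixels and safe (farther / open) pixels."""
    valid = depth_m > 0                              # pixels with a valid ToF return
    obstacles = valid & (depth_m < obstacle_range_m) # close returns -> obstacles
    safe = valid & ~obstacles                        # valid but farther -> safe area
    return obstacles, safe

# Dummy 240x320 depth frame in metres.
frame = np.random.uniform(0.0, 4.0, size=(240, 320))
obstacles, safe = segment_depth_frame(frame)
```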

    Action Recognition: From Static Datasets to Moving Robots

    Deep learning models have achieved state-of-the-art performance in recognizing human activities, but often rely on background cues present in typical computer vision datasets, which predominantly use a stationary camera. If these models are to be employed by autonomous robots in real-world environments, they must be adapted to perform independently of background cues and camera motion effects. To address these challenges, we propose a new method that first generates generic action region proposals with good potential to locate a human action in unconstrained videos regardless of camera motion, and then uses these proposals to extract and classify effective shape and motion features with a ConvNet framework. In a range of experiments, we demonstrate that by actively proposing action regions during both training and testing, state-of-the-art or better performance is achieved on benchmarks. We show that our approach outperforms the state of the art on two new datasets: one emphasizes irrelevant background, the other highlights camera motion. We also validate our action recognition method in an abnormal behavior detection scenario to improve workplace safety. The results verify a higher success rate for our method, owing to its ability to recognize human actions regardless of environment and camera motion.
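
    As a rough illustration of the two-stage pipeline described above (propose an action region, then classify it with a ConvNet), here is a minimal PyTorch sketch. The propose_action_region placeholder and the tiny classifier are hypothetical stand-ins, not the paper's models.

```python
# Two-stage sketch: 1) propose a person-centred action region per frame,
# 2) crop it, 3) classify the crops with a small ConvNet. The proposal step
# is a hypothetical placeholder; the paper's proposals are camera-motion robust.
import torch
import torch.nn as nn

def propose_action_region(frame: torch.Tensor) -> tuple[int, int, int, int]:
    """Hypothetical stand-in for the action-region proposal step.

    Returns a (top, left, height, width) box; here simply the central crop.
    """
    _, h, w = frame.shape
    return h // 4, w // 4, h // 2, w // 2

class ActionClassifier(nn.Module):
    """Tiny ConvNet over cropped action regions (illustrative only)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Usage on a dummy 30-frame RGB clip: class scores are averaged over per-frame crops.
clip = torch.rand(30, 3, 240, 320)
model = ActionClassifier()
crops = []
for frame in clip:
    t, l, h, w = propose_action_region(frame)
    crop = frame[:, t:t + h, l:l + w]
    crops.append(nn.functional.interpolate(crop[None], size=(112, 112)))
scores = model(torch.cat(crops)).mean(dim=0)
```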

    Adaptive human-centered representation for activity recognition of multiple individuals from 3D point cloud sequences

    Activity recognition of multiple individuals (ARMI) within a group, which is essential to practical human-centered robotics applications such as childhood education, is a particularly challenging and previously not well studied problem. We present a novel adaptive human-centered (AdHuC) representation based on local spatio-temporal (LST) features to address ARMI in a sequence of 3D point clouds. Our human-centered detector constructs affiliation regions to associate LST features with humans by mining depth data and using a cascade of rejectors to localize humans in 3D space. Then, features are detected within each affiliation region, which avoids extracting irrelevant features from dynamic background clutter and addresses moving cameras on mobile robots. Our feature descriptor is able to adapt its support region to linear perspective view variations and encode multi-channel information (i.e., color and depth) to construct the final representation. Empirical studies validate that the AdHuC representation obtains promising performance on ARMI using a Meka humanoid robot to play multi-person Simon Says games. Experiments on benchmark datasets further demonstrate that our adaptive human-centered representation outperforms previous approaches for activity recognition from color-depth data.
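
    The following NumPy sketch illustrates the human-centered idea of restricting feature extraction to affiliation regions around detected people, so that background clutter and camera motion contribute no features. The detect_human_centroids function and the simple color-depth histogram are hypothetical placeholders, not the paper's cascade-of-rejectors detector or descriptor.

```python
# Human-centred feature extraction sketch: compute descriptors only from points
# inside an affiliation region around each detected person, ignoring background.
import numpy as np

def detect_human_centroids(points: np.ndarray) -> np.ndarray:
    """Hypothetical detector: returns (K, 3) centroids of people in the cloud."""
    return points[:1, :3]  # placeholder: pretend the first point marks a person

def color_depth_histogram(region: np.ndarray, bins: int = 8) -> np.ndarray:
    """Concatenated per-channel histograms over x, y, z, r, g, b (multi-channel)."""
    return np.concatenate(
        [np.histogram(region[:, c], bins=bins, range=(0.0, 1.0))[0]
         for c in range(region.shape[1])]
    ).astype(float)

def extract_features(points: np.ndarray, radius: float = 0.6) -> list[np.ndarray]:
    """One descriptor per detected person; background points never contribute."""
    feats = []
    for centroid in detect_human_centroids(points):
        mask = np.linalg.norm(points[:, :3] - centroid, axis=1) < radius
        feats.append(color_depth_histogram(points[mask]))
    return feats

# Dummy frame: N points with normalised (x, y, z, r, g, b) channels.
cloud = np.random.rand(5000, 6)
descriptors = extract_features(cloud)
```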

    Robot Learning from Human Demonstration: Interpretation, Adaptation, and Interaction

    Robot Learning from Demonstration (LfD) is a research area that focuses on how robots can learn new skills by observing how people perform various activities. As humans, we have a remarkable ability to imitate other humans' behaviors and adapt to new situations. Endowing robots with these critical capabilities is a significant but very challenging problem, considering the complexity and variation of human activities in highly dynamic environments. This research focuses on how robots can learn new skills by interpreting human activities, adapting the learned skills to new situations, and naturally interacting with humans. This dissertation begins with a discussion of challenges in each of these three problems. A new unified representation approach is introduced to enable robots to simultaneously interpret the high-level semantic meanings and generalize the low-level trajectories of a broad range of human activities. An adaptive framework based on feature space decomposition is then presented for robots to not only reproduce skills, but also autonomously and efficiently adjust the learned skills to new environments that are significantly different from the demonstrations. To achieve natural Human-Robot Interaction (HRI), this dissertation presents a Recurrent Neural Network based deep perceptual control approach, which is capable of integrating multi-modal perception sequences with actions for robots to interact with humans in long-term tasks. Overall, by combining the above approaches, an autonomous system is created for robots to acquire important skills that can be applied to human-centered applications. Finally, this dissertation concludes with a discussion of future directions that could accelerate the upcoming technological revolution of robot learning from human demonstration.
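
    As a hedged sketch of the RNN-based deep perceptual control idea, the following PyTorch module fuses multi-modal observation features per time step and maps the sequence to continuous robot actions with an LSTM. The feature sizes and action dimension are illustrative assumptions, not values from the dissertation.

```python
# Sketch of RNN-based perceptual control: fuse multi-modal features per time
# step, run an LSTM over the sequence, and emit one action per time step.
import torch
import torch.nn as nn

class PerceptualController(nn.Module):
    def __init__(self, vision_dim=128, speech_dim=32, action_dim=7, hidden=256):
        super().__init__()
        self.fuse = nn.Linear(vision_dim + speech_dim, hidden)  # fuse modalities
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)    # temporal model
        self.policy = nn.Linear(hidden, action_dim)             # action head

    def forward(self, vision_seq, speech_seq, state=None):
        x = torch.relu(self.fuse(torch.cat([vision_seq, speech_seq], dim=-1)))
        h, state = self.rnn(x, state)
        return self.policy(h), state  # actions for every time step, plus RNN state

# Usage: a batch of 4 interaction episodes, 50 time steps each.
controller = PerceptualController()
actions, _ = controller(torch.rand(4, 50, 128), torch.rand(4, 50, 32))
print(actions.shape)  # torch.Size([4, 50, 7])
```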