
    A 3D scene analysis framework and descriptors for risk evaluation

    In this paper we evaluate the notion of scene analysis with regard to risk. We consider the problem of evaluating risk and potential hazards in an environment and providing a quantified risk score. A definition of risk is given incorporating two elements. First, scene stability: Newtonian physics is introduced into the scene analysis process to evaluate object stability within a scene, and its effectiveness is demonstrated by experiments on several scenes covering a variety of stability levels. Second, the analysis of the intrinsic risk-related properties of an object, estimated using learning techniques and the 3D Voxel HOG descriptor, which is analysed against state-of-the-art descriptors. Finally, a new dataset designed for scene analysis focusing on risk evaluation is provided.
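A minimal sketch of how the two elements described above might feed a single quantified risk score. The function name, the per-object averaging, and the equal default weights are illustrative assumptions, not the paper's actual formula:

```python
# Hypothetical combination of the two risk elements from the abstract:
# per-object stability scores and per-object intrinsic risk scores.
def scene_risk(stability_scores, object_risk_scores,
               w_stability=0.5, w_object=0.5):
    """Return a scene risk score in [0, 1].

    stability_scores   -- per-object instability in [0, 1] (1 = unstable)
    object_risk_scores -- per-object intrinsic risk in [0, 1], e.g. from
                          a classifier over 3D Voxel HOG features
    """
    if not stability_scores:
        return 0.0
    # Average each element over the scene's objects, then blend.
    stability = sum(stability_scores) / len(stability_scores)
    intrinsic = sum(object_risk_scores) / len(object_risk_scores)
    return w_stability * stability + w_object * intrinsic

print(scene_risk([0.9, 0.1], [0.8, 0.2]))  # 0.5
```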

    Scene analysis and risk estimation for domestic robots, security and smart homes

    The evaluation of risk within a scene is a new and emerging area of research. With the advent of smart-enabled homes and the continued development and implementation of domestic robotics, the platform for automated risk assessment within the home is now a possibility. The aim of this thesis is to explore a subsection of the problems facing the detection and quantification of risk in a domestic setting. A Risk Estimation framework is introduced which provides a flexible and context-aware platform from which measurable elements of risk can be combined to create a final risk score for a scene. To populate this framework, three elements of measurable risk are proposed and evaluated. Firstly, scene stability: assessing the location and stability of objects within an environment through the use of physics simulation techniques. Secondly, hazard feature analysis: using two specifically designed novel feature descriptors (3D Voxel HOG and the Physics Behaviour Feature) to determine whether the objects within a scene have dangerous or risky properties such as blades or points. Finally, environment interaction: using human behaviour simulation to predict human reactions to detected risks and highlight the areas of a scene most likely to be visited. Additionally, methodologies are introduced to support these concepts, including a simulation prediction framework, which reduces the computational cost of physics simulation, and a Robust Filter and Complex Adaboost, which aim to improve the robustness and training times of hazard feature classification models. The Human and Group Behaviour Evaluation framework is introduced to provide a platform from which simulation algorithms can be evaluated without the need for extensive ground-truth data. Finally, the 3D Risk Scenes (3DRS) dataset is introduced, creating a risk-specific dataset for the evaluation of future domestic risk analysis methodologies.
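The scene-stability element described above rests on a classical Newtonian criterion: a rigid object at rest is statically stable when its centre of mass projects inside its support footprint. The one-dimensional check below is a deliberately simplified illustration of that criterion, not the thesis's full physics simulation:

```python
# Illustrative static-stability test (1-D cross-section): an object
# topples when its centre of mass lies outside the extent of its base.
def is_statically_stable(com_x, support_min_x, support_max_x):
    """True if the centre of mass lies over the support footprint."""
    return support_min_x <= com_x <= support_max_x

# A box whose centre of mass sits over its base is stable; one leaning
# past the base edge is not.
print(is_statically_stable(0.5, 0.0, 1.0))  # True
print(is_statically_stable(1.2, 0.0, 1.0))  # False
```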

    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.

    Comment: 29 pages, 16 figures
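The descriptor family above pools joint statistics of receptive field responses over a video volume. A toy version of that idea can be sketched with plain finite differences standing in for the paper's time-causal receptive fields (an assumption made purely for brevity), binarising the sign of each response and pooling a joint histogram of the resulting codes:

```python
import numpy as np

# Toy joint histogram of binarised spatio-temporal derivative responses.
# Finite differences are a stand-in for the paper's time-causal
# scale-space receptive fields.
def binary_joint_histogram(video):
    """video: float array of shape (T, H, W) -> histogram over 2**3 codes."""
    dt = np.diff(video, axis=0)[:, :-1, :-1]   # temporal derivative
    dy = np.diff(video, axis=1)[:-1, :, :-1]   # vertical spatial derivative
    dx = np.diff(video, axis=2)[:-1, :-1, :]   # horizontal spatial derivative
    # Binarise each response by its sign and pack into a 3-bit joint code.
    code = (dt > 0).astype(int) + 2 * (dy > 0) + 4 * (dx > 0)
    hist = np.bincount(code.ravel(), minlength=8).astype(float)
    return hist / hist.sum()                   # normalised joint histogram

rng = np.random.default_rng(0)
h = binary_joint_histogram(rng.random((8, 16, 16)))
print(h.shape)  # (8,)
```

Joint (rather than marginal) histograms preserve the co-occurrence structure between response channels, which is what gives this family of descriptors its discriminative power.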

    Risk analysis for smart homes and domestic robots using robust shape and physics descriptors, and complex boosting techniques

    In this paper, the notion of risk analysis within 3D scenes using vision-based techniques is introduced. In particular, the problem of risk estimation of indoor environments at the scene and object level is considered, with applications in domestic robots and smart homes. To this end, the proposed Risk Estimation Framework is described, which provides a quantified risk score for a given scene. This methodology is extended with the introduction of a novel robust kernel for 3D shape descriptors such as 3D HOG and SIFT3D, which aims to reduce the effects of outliers in the proposed risk recognition methodology. The Physics Behaviour Feature (PBF) is presented, which uses an object's angular velocity obtained from Newtonian physics simulation as a descriptor. Furthermore, an extension of boosting techniques for learning is suggested in the form of the novel Complex and Hyper-Complex Adaboost, which greatly increase the computational efficiency of the original technique. In order to evaluate the proposed robust descriptors, an enriched version of the 3D Risk Scenes (3DRS) dataset with extra objects, scenes and metadata was utilised. A comparative study was conducted, demonstrating that the suggested approach outperforms current state-of-the-art descriptors.
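The abstract does not specify the Complex and Hyper-Complex Adaboost variants themselves; as a point of reference, the classic discrete AdaBoost loop they extend can be sketched with one-feature threshold stumps:

```python
import numpy as np

# Classic discrete AdaBoost with 1-D threshold stumps -- the baseline
# technique the paper's Complex/Hyper-Complex variants build upon.
def adaboost_train(X, y, n_rounds=10):
    """X: (n, d) features, y: labels in {-1, +1}. Returns stump list."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                 # sample weights
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):                  # search all 1-D threshold stumps
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] >= thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = max(err, 1e-12)               # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)
        pred = sign * np.where(X[:, j] >= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)      # up-weight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def adaboost_predict(ensemble, X):
    score = sum(a * s * np.where(X[:, j] >= t, 1, -1)
                for a, j, t, s in ensemble)
    return np.sign(score)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
model = adaboost_train(X, y, n_rounds=3)
print(adaboost_predict(model, X))  # [-1. -1.  1.  1.]
```

The exhaustive stump search above is the dominant training cost, which is the part the paper's variants reportedly make more efficient.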

    A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction

    Picking up objects requested by a human user is a common task in human-robot interaction. When multiple objects match the user's verbal description, the robot needs to clarify which object the user is referring to before executing the action. Previous research has focused on perceiving the user's multimodal behaviour to complement verbal commands, or on minimising the number of follow-up questions to reduce task time. In this paper, we propose a system for reference disambiguation based on visualisation and compare three methods to disambiguate natural language instructions. In a controlled experiment with a YuMi robot, we investigated real-time augmentations of the workspace in three conditions -- mixed reality, augmented reality, and a monitor as the baseline -- using objective measures such as time and accuracy, and subjective measures like engagement, immersion, and display interference. Significant differences were found in accuracy and engagement between the conditions, but no differences were found in task time. Despite the higher error rates in the mixed reality condition, participants found that modality more engaging than the other two, but overall showed a preference for the augmented reality condition over the monitor and mixed reality conditions.

    Learning Descriptors for Object Recognition and 3D Pose Estimation

    Detecting poorly textured objects and estimating their 3D pose reliably is still a very challenging problem. We introduce a simple but powerful approach to computing descriptors for object views that efficiently capture both the object identity and 3D pose. By contrast with previous manifold-based approaches, we can rely on the Euclidean distance to evaluate the similarity between descriptors, and therefore use scalable Nearest Neighbor search methods to efficiently handle a large number of objects under a large range of poses. To achieve this, we train a Convolutional Neural Network to compute these descriptors by enforcing simple similarity and dissimilarity constraints between the descriptors. We show that our constraints nicely untangle the images from different objects and different views into clusters that are not only well-separated but also structured as the corresponding sets of poses: the Euclidean distance between descriptors is large when the descriptors are from different objects, and directly related to the distance between the poses when the descriptors are from the same object. These important properties allow us to outperform state-of-the-art object-view representations on challenging RGB and RGB-D data.

    Comment: CVPR 201
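Because the learned descriptors live in a space where plain Euclidean distance is meaningful, recognition reduces to nearest-neighbour search over a descriptor database, as in the sketch below. The random vectors stand in for CNN output purely for illustration:

```python
import numpy as np

# Nearest-neighbour retrieval under Euclidean distance -- the search
# step the abstract's descriptor space is designed to enable.
def nearest_neighbour(query, database):
    """Return the index of the database descriptor closest to `query`."""
    dists = np.linalg.norm(database - query, axis=1)  # Euclidean distances
    return int(np.argmin(dists))

rng = np.random.default_rng(1)
database = rng.standard_normal((100, 32))              # 100 stored views
query = database[42] + 0.01 * rng.standard_normal(32)  # noisy view of item 42
print(nearest_neighbour(query, database))  # 42
```

Since the descriptors are ordinary Euclidean vectors, this linear scan can be swapped for any scalable approximate nearest-neighbour structure (e.g. a KD-tree) without changing the recognition logic.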