41 research outputs found

    PlaNet-ClothPick: Effective Fabric Flattening Based on Latent Dynamic Planning

    Why do Recurrent State Space Models such as PlaNet fail at cloth manipulation tasks? Recent work has attributed this to the blurry prediction of the observation, which makes it difficult to plan directly in the latent space. This paper explores the reasons behind this by applying PlaNet in the pick-and-place fabric-flattening domain. We find that the sharp discontinuity of the transition function on the contour of the fabric makes it difficult to learn an accurate latent dynamic model, causing the MPC planner to produce pick actions slightly outside of the article. By limiting picking space on the cloth mask and training on specially engineered trajectories, our mesh-free PlaNet-ClothPick surpasses visual planning and policy learning methods on principal metrics in simulation, achieving performance similar to that of state-of-the-art mesh-based planning approaches. Notably, our model exhibits faster action inference and requires fewer transitional model parameters than the state-of-the-art robotic systems in this domain. Other supplementary materials are available at: https://sites.google.com/view/planet-clothpick. Comment: 12 pages, 2 tables, and 14 figures. It has been accepted to The 2024 16th IEEE/SICE International Symposium on System Integration, Ha Long, Vietnam 8-11th January, 202
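
The idea of limiting the picking space to the cloth mask can be illustrated with a small sketch. The abstract does not say how the restriction is implemented, so the projection below is an assumption made for illustration only: a hypothetical helper, `snap_pick_to_mask`, moves a planner-proposed pick pixel to the nearest pixel that lies on a binary cloth mask.

```python
import numpy as np


def snap_pick_to_mask(pick, mask):
    """Return the cloth pixel (row, col) closest to a proposed pick position.

    Hypothetical helper, not the paper's implementation: `mask` is a 2D
    boolean array that is True on cloth pixels, `pick` is a (row, col) pair.
    """
    coords = np.argwhere(mask)  # (N, 2) array of cloth pixel coordinates
    if coords.size == 0:
        raise ValueError("mask contains no cloth pixels")
    # Squared Euclidean distance from the proposal to every cloth pixel.
    d2 = np.sum((coords - np.asarray(pick)) ** 2, axis=1)
    nearest = coords[np.argmin(d2)]
    return tuple(int(v) for v in nearest)
```

A pick that already lies on the cloth is returned unchanged, and a pick just outside the contour is pulled back onto it, which is the failure case the abstract describes.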

    Supervisor recommendation tool for Computer Science projects

    In most Computer Science programmes, students are required to undertake an individual project under the guidance of a supervisor during their studies. With increasing student numbers, matching students to suitable supervisors is becoming more challenging. This paper presents a software tool which assists Computer Science students in identifying the most suitable supervisor for their final year project. It does this by matching a list of keywords or a project proposal provided by the students to a list of keywords which were automatically extracted from freely available data for each potential supervisor. The tool was evaluated using both manual and user testing, with generally positive results and user feedback. 83% of respondents agree that the current implementation of the tool is accurate, with 67% saying it would be a useful tool to have when looking for a supervisor. The tool is currently being adapted for wider use in the School.
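
The keyword matching described above can be sketched as a set-overlap ranking. The abstract does not name the similarity measure the tool uses, so Jaccard similarity is an assumption here, and the function names `jaccard` and `rank_supervisors` are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a ∩ b| / |a ∪ b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_supervisors(student_keywords, supervisor_keywords):
    """Rank supervisors by keyword overlap with the student's keywords.

    `supervisor_keywords` maps a supervisor name to an iterable of keywords.
    Returns (name, score) pairs, best match first; ties are broken by name.
    """
    student = {k.lower() for k in student_keywords}
    scores = {
        name: jaccard(student, {k.lower() for k in keywords})
        for name, keywords in supervisor_keywords.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Lower-casing both sides keeps the comparison case-insensitive; a real tool would likely also need stemming or synonym handling, which this sketch leaves out.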

    Texture features for object salience

    Although texture is important for many vision-related tasks, it is not used in most salience models. As a consequence, there are images where all existing salience algorithms fail. We introduce a novel set of texture features built on top of a fast model of complex cells in striate cortex, i.e., visual area V1. The texture at each position is characterised by the two-dimensional local power spectrum obtained from Gabor filters which are tuned to many scales and orientations. We then apply a parametric model and describe the local spectrum by the combination of two one-dimensional Gaussian approximations: the scale and orientation distributions. The scale distribution indicates whether the texture has a dominant frequency and what frequency it is. Likewise, the orientation distribution attests the degree of anisotropy. We evaluate the features in combination with the state-of-the-art VOCUS2 salience algorithm. We found that using our novel texture features in addition to colour improves AUC by 3.8% on the PASCAL-S dataset when compared to the colour-only baseline, and by 62% on a novel texture-based dataset. (C) 2017 Elsevier B.V. All rights reserved. EU [ICT-2009.2.1-270247
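
One way to picture the two one-dimensional summaries of a local power spectrum is the sketch below. It is not the paper's parametric model: the input layout (a `power` array indexed by scale and orientation), the use of scale indices instead of frequencies, and the doubled-angle circular statistics for orientation are all assumptions chosen to keep the example short.

```python
import numpy as np


def summarise_spectrum(power, thetas):
    """Summarise a local Gabor power spectrum by scale and orientation statistics.

    power:  array of shape (n_scales, n_orientations) with non-negative energies.
    thetas: filter orientations in radians, in [0, pi).
    Returns (scale_mean, scale_std, mean_orientation, anisotropy).
    """
    power = np.asarray(power, dtype=float)
    total = power.sum()
    if total <= 0:
        raise ValueError("power spectrum must contain some energy")

    # Scale distribution: marginalise over orientation, then take the
    # mean and standard deviation over the scale index.
    scale_p = power.sum(axis=1) / total
    idx = np.arange(power.shape[0])
    scale_mean = float(np.sum(scale_p * idx))
    scale_std = float(np.sqrt(np.sum(scale_p * (idx - scale_mean) ** 2)))

    # Orientation distribution: orientations repeat every pi, so angles
    # are doubled before averaging on the unit circle.
    orient_p = power.sum(axis=0) / total
    resultant = np.sum(orient_p * np.exp(2j * np.asarray(thetas)))
    mean_orientation = float(np.angle(resultant) / 2)
    anisotropy = float(np.abs(resultant))  # 0 = isotropic, 1 = one orientation
    return scale_mean, scale_std, mean_orientation, anisotropy
```

A small `scale_std` suggests a dominant frequency, and an `anisotropy` close to 1 suggests a strongly oriented texture, mirroring the roles the abstract gives to the two distributions.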

    Fast and accurate multi-scale keypoints based on end-stopped cells

    An increasing number of applications in computer vision employ interest points. Algorithms like SIFT and SURF are all based on partial derivatives of images smoothed with Gaussian filter kernels. These algorithms are fast and therefore very popular

    Fast cortical keypoints for real-time object recognition

    Best-performing object recognition algorithms employ a large number of features extracted on a dense grid, so they are too slow for real-time and active vision. In this paper we present a fast cortical keypoint detector for extracting meaningful points from images. It is competitive with state-of-the-art detectors and particularly well-suited for tasks such as object recognition. We show that by using these points we can achieve state-of-the-art categorization results in a fraction of the time required by competing algorithms

    Phase-differencing in stereo vision: solving the localisation problem

    Complex Gabor filters with phases in quadrature are often used to model even- and odd-symmetric simple cells in the primary visual cortex. In stereo vision, the phase difference between the responses of the left and right views can be used to construct a disparity or depth map. Various constraints can be applied in order to construct smooth maps, but this leads to very imprecise depth transitions. In this theoretical paper we show, by using lines and edges as image primitives, the origin of the localisation problem. We also argue that disparity should be attributed to lines and edges, rather than trying to construct a 3D surface map in cortical area V1. We derive allowable translation ranges which yield correct disparity estimates, both for left-view centered vision and for cyclopean vision
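
The phase-differencing principle mentioned above can be sketched in one dimension. This is a minimal illustration and not the paper's model: the function names, the 3-sigma kernel truncation, and the use of a single filter frequency `omega` are assumptions. The estimate is only unambiguous while the true shift stays below pi / omega, because the phase difference wraps around.

```python
import numpy as np


def gabor_kernel(omega, sigma):
    """Complex 1D Gabor kernel: the real part is even, the imaginary part odd."""
    half = int(np.ceil(3 * sigma))
    x = np.arange(-half, half + 1)
    return np.exp(-(x ** 2) / (2 * sigma ** 2)) * np.exp(1j * omega * x)


def phase_disparity(left, right, omega, sigma):
    """Estimate per-sample disparity from the left/right phase difference.

    A positive value means the right signal is the left signal shifted
    towards larger x.
    """
    kernel = gabor_kernel(omega, sigma)
    resp_left = np.convolve(left, kernel, mode="same")
    resp_right = np.convolve(right, kernel, mode="same")
    # The angle of the conjugate product is the phase difference,
    # already wrapped to (-pi, pi].
    dphi = np.angle(resp_left * np.conj(resp_right))
    return dphi / omega
```

For example, with `left = cos(omega * x)` and `right = cos(omega * (x - 3))`, the estimate away from the signal borders is close to 3 as long as `3 * omega` is below pi.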

    A biological and real-time framework for hand gestures and head poses

    Human-robot interaction is an interdisciplinary research area that aims at the development of social robots. Since social robots are expected to interact with humans and understand their behavior through gestures and body movements, cognitive psychology and robot technology must be integrated. In this paper we present a biological and real-time framework for detecting and tracking hands and heads. This framework is based on keypoints extracted by means of cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. Through the combination of annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated and tracked over time. By using hand templates with lines and edges at only a few scales, a hand’s gestures can be recognized. Head tracking and pose detection are also implemented, which can be integrated with detection of facial expressions in the future. Through the combinations of head poses and hand gestures a large number of commands can be given to a robot

    Multi-scale cortical keypoints for realtime hand tracking and gesture recognition

    Human-robot interaction is an interdisciplinary research area which aims at integrating human factors, cognitive psychology and robot technology. The ultimate goal is the development of social robots. These robots are expected to work in human environments, and to understand behavior of persons through gestures and body movements. In this paper we present a biological and realtime framework for detecting and tracking hands. This framework is based on keypoints extracted from cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. By combining annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated, their movements can be obtained, and they can be tracked over time. By using hand templates with keypoints at only two scales, a hand’s gestures can be recognized

    A disparity energy model improved by line, edge and keypoint correspondences

    Disparity energy models (DEMs) estimate local depth information on the basis of V1 complex cells. Our recent DEM (Martins et al., 2011, ISSPIT, 261-266) employs a population code. Once the population's cells have been trained with random-dot stereograms, it is applied at all retinotopic positions in the visual field. Despite producing good results in textured regions, the model needs to be made more precise, especially at depth transitions