
    Single-Shot Clothing Category Recognition in Free-Configurations with Application to Autonomous Clothes Sorting

    This paper proposes a single-shot approach for recognising clothing categories from 2.5D features. We propose two visual features, BSP (B-Spline Patch) and TSD (Topology Spatial Distances), for this task. The local BSP features are encoded by LLC (Locality-constrained Linear Coding) and fused with three different global features. Our visual feature is robust to deformable shapes and our approach is able to recognise the category of unknown clothing in unconstrained and random configurations. We integrated the category recognition pipeline with a stereo vision system, clothing instance detection, and dual-arm manipulators to achieve an autonomous sorting system. To verify the performance of our proposed method, we build a high-resolution RGBD clothing dataset of 50 clothing items of 5 categories sampled in random configurations (a total of 2,100 clothing samples). Experimental results show that our approach is able to reach 83.2% accuracy while classifying clothing items which were previously unseen during training. This advances beyond the previous state-of-the-art by 36.2%. Finally, we evaluate the proposed approach in an autonomous robot sorting system, in which the robot recognises a clothing item from an unconstrained pile, grasps it, and sorts it into a box according to its category. Our proposed sorting system achieves reasonable sorting success rates with single-shot perception. Comment: 9 pages, accepted by IROS201

    Confident Kernel Sparse Coding and Dictionary Learning

    In recent years, kernel-based sparse coding (K-SRC) has received particular attention due to its efficient representation of nonlinear data structures in the feature space. Nevertheless, the existing K-SRC methods suffer from a lack of consistency between their training and test optimization frameworks. In this work, we propose a novel confident K-SRC and dictionary learning algorithm (CKSC) which focuses on the discriminative reconstruction of the data based on its representation in the kernel space. CKSC focuses on reconstructing each data sample via weighted contributions which are confident in its corresponding class of data. We employ novel discriminative terms to apply this scheme to both training and test frameworks in our algorithm. This specific design increases the consistency of these optimization frameworks and improves the discriminative performance in the recall phase. In addition, CKSC directly employs the supervised information in its dictionary learning framework to enhance the discriminative structure of the dictionary. For empirical evaluations, we implement our CKSC algorithm on multivariate time-series benchmarks such as DynTex++ and UTKinect. Our claims regarding the superior performance of the proposed algorithm are justified by comparing its classification results to those of the state-of-the-art K-SRC algorithms. Comment: 10 pages, ICDM 2018 conference

    The evolution of a visual-to-auditory sensory substitution device using interactive genetic algorithms

    Sensory Substitution is a promising technique for mitigating the loss of a sensory modality. Sensory Substitution Devices (SSDs) work by converting information from the impaired sense (e.g. vision) into another, intact sense (e.g. audition). However, there is a potentially infinite number of ways of converting images into sounds, and it is important that the conversion takes into account the limits of human perception and other user-related factors (e.g. whether the sounds are pleasant to listen to). The device explored here is termed “polyglot” because it generates a very large set of solutions. Specifically, we adapt a procedure that has been in widespread use in the design of technology but has rarely been used as a tool to explore perception – namely Interactive Genetic Algorithms. In this procedure, a very large range of potential sensory substitution devices can be explored by creating a set of ‘genes’ with different allelic variants (e.g. different ways of translating luminance into loudness). The most successful devices are then ‘bred’ together and we statistically explore the characteristics of the selected-for traits after multiple generations. The aim of the present study is to produce design guidelines for a better SSD. In three experiments we vary the way that the fitness of the device is computed: by asking the user to rate the auditory aesthetics of different devices (Experiment 1), by measuring the ability of participants to match sounds to images (Experiment 2), and by measuring the ability to perceptually discriminate between two sounds derived from similar images (Experiment 3). In each case the traits selected for by the genetic algorithm represent the ideal SSD for that task. Taken together, these traits can guide the design of a better SSD.
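
    The breeding procedure described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the gene names, the allelic variants, the population settings, and especially `stand_in_fitness`, which replaces the human ratings or task scores that an interactive genetic algorithm would collect from participants. It is not the authors' implementation.

```python
import random

# Hypothetical genes and allelic variants (not taken from the study).
GENES = {
    "luminance_to": ["loudness", "pitch", "none"],
    "vertical_to": ["pitch", "pan", "time"],
    "scan": ["left_to_right", "right_to_left"],
}

def random_device(rng):
    """Draw one allele per gene to form a candidate device."""
    return {gene: rng.choice(alleles) for gene, alleles in GENES.items()}

def breed(a, b, rng, mutation_rate=0.05):
    """Uniform crossover of two parents, with occasional random mutation."""
    child = {}
    for gene, alleles in GENES.items():
        child[gene] = rng.choice([a[gene], b[gene]])
        if rng.random() < mutation_rate:
            child[gene] = rng.choice(alleles)
    return child

def evolve(fitness, generations=20, pop_size=30, seed=0):
    """Keep the fitter half each generation and refill with their offspring."""
    rng = random.Random(seed)
    population = [random_device(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            breed(rng.choice(parents), rng.choice(parents), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return population

def stand_in_fitness(device):
    """Placeholder score; in an interactive GA this would come from a user."""
    return (device["luminance_to"] == "loudness") + (device["vertical_to"] == "pitch")

if __name__ == "__main__":
    final = evolve(stand_in_fitness)
    print(max(final, key=stand_in_fitness))
```

    Because the fitter half of each generation is carried over unchanged, the best score in the population cannot decrease between generations. Swapping `stand_in_fitness` for an aesthetic rating, a matching score, or a discrimination score would mirror the three ways of computing fitness that the abstract lists.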

    Vision-Based Tactile Paving Detection Method in Navigation Systems for Visually Impaired Persons

    In general, a visually impaired person relies on a guide cane to walk outside, in addition to tactile paving that serves as a warning and directional aid for avoiding obstructions and hazardous situations. However, considerable training is still needed to recognize tactile patterns, which is especially difficult for persons who have recently become visually impaired. This chapter describes the development and evaluation of a vision-based tactile paving detection method for visually impaired persons. Experiments are conducted on how well the method detects the tactile pavement and identifies the shape of the tactile pattern. The method is implemented in MATLAB, with an Arduino platform and a speaker as guidance tools. Based on the tactile detection result from MATLAB, the system produces auditory output that notifies the visually impaired person of the type of tactile paving detected. Consequently, the tactile pavement detection system can be used by visually impaired persons for easier detection and navigation.

    Cognitive Information Processing

    Contains research objectives and reports on five research projects. Joint Services Electronics Programs (U.S. Army, U.S. Navy, and U.S. Air Force) under Contract DA 36-039-AMC-03200(E); National Science Foundation (Grant GK-835); National Institutes of Health (Grant 2 P01 MH-04737-06); National Aeronautics and Space Administration (Grant NsG 496)

    Exploiting Prior Knowledge in Compressed Sensing Wireless ECG Systems

    Recent results in telecardiology show that compressed sensing (CS) is a promising tool to lower energy consumption in wireless body area networks for electrocardiogram (ECG) monitoring. However, the performance of current CS-based algorithms, in terms of compression rate and reconstruction quality of the ECG, still falls short of the performance attained by state-of-the-art wavelet-based algorithms. In this paper, we propose to exploit the structure of the wavelet representation of the ECG signal to boost the performance of CS-based methods for compression and reconstruction of ECG signals. More precisely, we incorporate prior information about the wavelet dependencies across scales into the reconstruction algorithms and exploit the high fraction of common support of the wavelet coefficients of consecutive ECG segments. Experimental results utilizing the MIT-BIH Arrhythmia Database show that significant performance gains, in terms of compression rate and reconstruction quality, can be obtained by the proposed algorithms compared to current CS-based methods. Comment: Accepted for publication at IEEE Journal of Biomedical and Health Informatics

    Spontaneous Analogy by Piggybacking on a Perceptual System

    Most computational models of analogy assume they are given a delineated source domain and often a specified target domain. These systems do not address how analogs can be isolated from large domains and spontaneously retrieved from long-term memory, a process we call spontaneous analogy. We present a system that represents relational structures as feature bags. Using this representation, our system leverages perceptual algorithms to automatically create an ontology of relational structures and to efficiently retrieve analogs for new relational structures from long-term memory. We provide a demonstration of our approach that takes a set of unsegmented stories, constructs an ontology of analogical schemas (corresponding to plot devices), and uses this ontology to efficiently find analogs within new stories, yielding significant time savings over linear analog retrieval at a small accuracy cost. Comment: Proceedings of the 35th Meeting of the Cognitive Science Society, 201

    Mathematical Methods Applied to Digital Image Processing

    Introduction: Digital image processing (DIP) is an important research area since it spans a variety of applications. Although over the past few decades there has been a rapid rise in this field, there still remain issues to address. Examples include image coding, image restoration, 3D image processing, feature extraction and analysis, moving object detection, and face recognition. To deal with these issues, the use of sophisticated and robust mathematical algorithms plays a crucial role. The aim of this special issue is to provide an opportunity for researchers to publish their latest theoretical and technological achievements in mathematical methods and their various applications related to DIP. This special issue covers topics related to the development of mathematical methods and their applications. It has a total of twenty-four high-quality papers covering various important topics in DIP, including image preprocessing, image encoding/decoding, stereo image reconstruction, dimensionality and data size reduction, and applications.