
    Describing Textures in the Wild

    Patterns and textures are defining characteristics of many natural objects: a shirt can be striped, the wings of a butterfly can be veined, and the skin of an animal can be scaly. Aiming to support this analytical dimension in image understanding, we address the challenging problem of describing textures with semantic attributes. We identify a rich vocabulary of forty-seven texture terms and use them to describe a large dataset of patterns collected in the wild. The resulting Describable Textures Dataset (DTD) serves as the basis for seeking the best texture representation for recognizing describable texture attributes in images. We port the Improved Fisher Vector (IFV) from object recognition to texture recognition and show that, surprisingly, it outperforms specialized texture descriptors not only on our problem but also on established material recognition datasets. We also show that the describable attributes are excellent texture descriptors, transferring between datasets and tasks; in particular, combined with IFV, they outperform the state of the art by more than 8 percent on both the FMD and KTHTIPS-2b benchmarks. We also demonstrate that they produce intuitive descriptions of materials and Internet images. Comment: 13 pages; 12 figures Fixed misplaced affiliatio

    Associating object names with descriptions of shape that distinguish possible from impossible objects.

    Five experiments examine the proposal that object names are closely linked to representations of global, 3D shape by comparing memory for simple line drawings of structurally possible and impossible novel objects. Objects were rendered impossible through local edge violations to global coherence (cf. Schacter, Cooper, & Delaney, 1990), and supplementary observations confirmed that the sets of possible and impossible objects were matched for their distinctiveness. Employing a test of explicit recognition memory, Experiment 1 confirmed that the possible and impossible objects were equally memorable. Experiments 2–4 demonstrated that adults learn names (single-syllable non-words presented as count nouns, e.g., “This is a dax”) for possible objects more easily than for impossible objects, and an item-based analysis showed that this effect was unrelated to either the memorability or the distinctiveness of the individual objects. Experiment 3 indicated that the effects of object possibility on name learning were long term (spanning at least 2 months), implying that the cognitive processes being revealed can support the learning of object names in everyday life. Experiment 5 demonstrated that hearing someone else name an object at presentation improves recognition memory for possible objects, but not for impossible objects. Taken together, the results indicate that object names are closely linked to the descriptions of global, 3D shape that can be derived for structurally possible objects but not for structurally impossible objects. In addition, the results challenge the view that object decision and explicit recognition necessarily draw on separate memory systems, with only the former being supported by these descriptions of global object shape. It seems that recognition can also be supported by these descriptions, provided the original encoding conditions encourage their derivation. Hearing an object named at encoding appears to be just such a condition. These observations are discussed in relation to the effects of naming in other visual tasks, and to the role of visual attention in object identification.

    Active Clothing Material Perception using Tactile Sensing and Deep Learning

    Humans represent and discriminate objects in the same category using their properties, and an intelligent robot should be able to do the same. In this paper, we build a robot system that can autonomously perceive object properties through touch. We work on the common object category of clothing. The robot moves under the guidance of an external Kinect sensor and squeezes the clothes with a GelSight tactile sensor; it then recognizes 11 properties of the clothing from the tactile data. These properties include physical properties, such as thickness, fuzziness, softness and durability, and semantic properties, such as wearing season and preferred washing methods. We collect a dataset of 153 varied pieces of clothing and conduct 6616 robot exploring iterations on them. To extract useful information from the high-dimensional sensory output, we apply Convolutional Neural Networks (CNNs) to the tactile data for recognizing the clothing properties, and to the Kinect depth images for selecting exploration locations. Experiments show that, using the trained neural networks, the robot can autonomously explore unknown clothes and learn their properties. This work proposes a new framework for active tactile perception with a vision-touch system, and has the potential to enable robots to help humans with varied clothing-related housework. Comment: ICRA 2018 accepte

    Do You See What I Mean? Visual Resolution of Linguistic Ambiguities

    Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, representing a wide range of syntactic, semantic and discourse ambiguities, coupled with videos that visualize the different interpretations for each sentence. We address this task by extending a vision model which determines if a sentence is depicted by a video. We demonstrate how such a model can be adjusted to recognize different interpretations of the same underlying sentence, allowing sentences to be disambiguated in a unified fashion across the different ambiguity types. Comment: EMNLP 201

    A survey of visual preprocessing and shape representation techniques

    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and, most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

    Eye movement patterns during the recognition of three-dimensional objects: Preferential fixation of concave surface curvature minima

    This study used eye movement patterns to examine how high-level shape information is used during 3D object recognition. Eye movements were recorded while observers either actively memorized or passively viewed sets of novel objects, and then during a subsequent recognition memory task. Fixation data were contrasted against different algorithmically generated models of shape analysis based on regions of (1) internal concave surface curvature discontinuity, (2) internal convex surface curvature discontinuity, or (3) external bounding contour. The results showed a preference for fixation at regions of internal local features during both active memorization and passive viewing, but also for regions of concave surface curvature during the recognition task. These findings provide new evidence supporting the special functional status of local concave discontinuities in recognition and show how studies of eye movement patterns can elucidate shape information processing in human vision.