
    Skip-Thought Vectors

    We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen during training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification, and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice. We will make our encoder publicly available. Comment: 11 pages
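
    As a concrete illustration of the vocabulary-expansion step described above, the sketch below learns a least-squares linear map from a large pretrained word-embedding space (e.g. word2vec) into the encoder's word-embedding space, so that words never seen during training can still be fed to the encoder. The dictionaries `w2v` and `rnn_emb` and their contents are illustrative assumptions, not the authors' released code.

```python
# Sketch of vocabulary expansion: map pretrained word vectors into the trained
# encoder's embedding space with a least-squares linear map.
# `w2v` and `rnn_emb` are assumed to be dicts {word: 1-D np.ndarray}.
import numpy as np

def expand_vocabulary(w2v, rnn_emb):
    shared = [w for w in rnn_emb if w in w2v]      # words present in both vocabularies
    X = np.stack([w2v[w] for w in shared])         # (n, d_w2v)
    Y = np.stack([rnn_emb[w] for w in shared])     # (n, d_rnn)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)      # solve X @ W ~= Y
    # Any word with a pretrained vector can now be projected into the encoder's space.
    return {w: v @ W for w, v in w2v.items()}
```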

    What are the shapes of response time distributions in visual search?

    Many visual search experiments measure response time (RT) as their primary dependent variable. Analyses typically focus on mean (or median) RT. However, given enough data, the RT distribution can be a rich source of information. For this paper, we collected about 500 trials per cell per observer for both target-present and target-absent displays in each of three classic search tasks: feature search, with the target defined by color; conjunction search, with the target defined by both color and orientation; and spatial configuration search for a 2 among distractor 5s. This large data set allows us to characterize the RT distributions in detail. We present the raw RT distributions and fit several psychologically motivated functions (ex-Gaussian, ex-Wald, Gamma, and Weibull) to the data. We analyze and interpret parameter trends from these four functions within the context of theories of visual search.
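
    To make the distribution-fitting step concrete, the sketch below fits one of the four candidate forms named above, the ex-Gaussian, to a single cell's response times by maximum likelihood using SciPy's exponentially modified normal distribution. The file name is hypothetical and the snippet is only a minimal sketch, not the analysis pipeline used in the paper.

```python
# Minimal ex-Gaussian fit to one observer / one condition worth of RTs (seconds).
# SciPy's exponnorm is the exponentially modified Gaussian with shape K = tau / sigma.
import numpy as np
from scipy import stats

rts = np.loadtxt("observer01_feature_target_present.txt")  # hypothetical RT file, one value per trial
K, loc, scale = stats.exponnorm.fit(rts)                   # maximum-likelihood estimates
mu, sigma, tau = loc, scale, K * scale                     # conventional ex-Gaussian parameters
print(f"mu = {mu:.3f} s, sigma = {sigma:.3f} s, tau = {tau:.3f} s")
```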

    A Tree-Based Context Model for Object Recognition

    There has been growing interest in exploiting contextual information in addition to local features to detect and localize multiple object categories in an image. A context model can rule out some unlikely combinations or locations of objects and guide detectors to produce a semantically coherent interpretation of a scene. However, the performance benefit of context models has been limited because most previous methods were tested on datasets with only a few object categories, in which most images contain one or two object categories. In this paper, we introduce a new dataset with images that contain many instances of different object categories, and propose an efficient model that captures the contextual information among more than a hundred object categories using a tree structure. Our model incorporates global image features, dependencies between object categories, and outputs of local detectors into one probabilistic framework. We demonstrate that our context model improves object recognition performance and provides a coherent interpretation of a scene, which enables a reliable image querying system by multiple object categories. In addition, our model can be applied to scene understanding tasks that local detectors alone cannot solve, such as detecting objects out of context or querying for the most typical and the least typical scenes in a dataset. This research was partially funded by Shell International Exploration and Production Inc., by the Army Research Office under award W911NF-06-1-0076, by an NSF CAREER Award (ISI 0747120), and by the Air Force Office of Scientific Research under Award No. FA9550-06-1-0324. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the Air Force.
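
    One common way to realize the tree structure mentioned above is a Chow-Liu-style construction: estimate pairwise mutual information between binary "category present" variables and keep a maximum-weight spanning tree. The sketch below, which assumes an illustrative 0/1 co-occurrence matrix `presence`, shows that construction only; it is not the authors' exact estimator and omits the global features and local detector outputs.

```python
# Chow-Liu-style tree over object-category co-occurrence (sketch).
# `presence` is an illustrative (n_images, n_categories) binary matrix.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def cooccurrence_tree(presence, eps=1e-9):
    n, k = presence.shape
    mi = np.zeros((k, k))
    for a in range(k):
        for b in range(a + 1, k):
            joint = np.histogram2d(presence[:, a], presence[:, b], bins=2)[0] / n
            pa = joint.sum(axis=1, keepdims=True)          # marginal of category a
            pb = joint.sum(axis=0, keepdims=True)          # marginal of category b
            mi[a, b] = np.sum(joint * np.log((joint + eps) / (pa @ pb + eps)))
    # Maximum-weight spanning tree == minimum spanning tree on negated weights.
    tree = minimum_spanning_tree(-mi)
    return np.array(tree.nonzero()).T                      # edges as (category_i, category_j) pairs
```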

    SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

    Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios. To improve these systems' perceptive speed and robustness, we present SegICP, a novel integrated solution to object recognition and pose estimation. SegICP couples convolutional neural networks and multi-hypothesis point cloud registration to achieve both robust pixel-wise semantic segmentation and accurate, real-time 6-DOF pose estimation for relevant objects. Our architecture achieves 1 cm position error and $<5^\circ$ angle error in real time without an initial seed. We evaluate and benchmark SegICP against an annotated dataset generated by motion capture. Comment: IROS camera-ready
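
    A minimal sketch of the segmentation-then-registration idea is given below: a per-pixel mask from the segmentation network selects the object's 3-D points, which are then aligned to a stored model cloud with a single ICP call (the actual system uses multi-hypothesis registration). It uses the Open3D library; the function, file names, and thresholds are illustrative assumptions, not the SegICP implementation.

```python
# Segmentation-masked point cloud + ICP pose estimate (sketch, not the SegICP code).
import numpy as np
import open3d as o3d

def estimate_pose(scene_points, mask, model_cloud_path, voxel=0.005):
    # scene_points: (H*W, 3) array of scene 3-D points; mask: (H, W) boolean object mask.
    obj = o3d.geometry.PointCloud()
    obj.points = o3d.utility.Vector3dVector(
        np.asarray(scene_points, dtype=np.float64)[mask.ravel()])
    obj = obj.voxel_down_sample(voxel)

    model = o3d.io.read_point_cloud(model_cloud_path).voxel_down_sample(voxel)
    result = o3d.pipelines.registration.registration_icp(
        model, obj, 0.02, np.eye(4),                       # 2 cm correspondence threshold, identity seed
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation                           # 4x4 model-to-scene pose
```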

    Modelling search for people in 900 scenes: A combined source model of eye guidance

    How predictable are human eye movements during search in real-world scenes? We recorded 14 observers’ eye movements as they performed a search task (person detection) in 912 outdoor scenes. Observers were highly consistent in the regions fixated during search, even when the target was absent from the scene. These eye movements were used to evaluate computational models of search guidance from three sources: saliency, target features, and scene context. Each of these models independently outperformed a cross-image control in predicting human fixations. Models that combined sources of guidance ultimately predicted 94% of human agreement, with the scene context component providing the most explanatory power. None of the models, however, could reach the precision and fidelity of an attentional map defined by human fixations. This work puts forth a benchmark for computational models of search in real-world scenes. Further improvements in modelling should capture mechanisms underlying the selectivity of observers’ fixations during search. Funding: National Eye Institute (Integrative Training Program in Vision grant T32 EY013935); Massachusetts Institute of Technology (Singleton Graduate Research Fellowship); National Science Foundation (U.S.) (Graduate Research Fellowship); National Science Foundation (U.S.) (CAREER Award 0546262); National Science Foundation (U.S.) (NSF contract 0705677); National Science Foundation (U.S.) (CAREER Award 0747120).
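
    As a rough illustration of how guidance sources can be combined and scored, the sketch below forms a weighted product of normalised saliency, target-feature, and scene-context maps and evaluates it against a human fixation map with an ROC analysis. The map names, weights, and AUC scorer are assumptions for illustration, not the authors' exact model.

```python
# Combine guidance maps and score them against human fixations (illustrative sketch).
import numpy as np
from sklearn.metrics import roc_auc_score

def combined_map(saliency, target_features, context, weights=(1.0, 1.0, 1.0)):
    combo = np.ones_like(saliency, dtype=float)
    for m, w in zip((saliency, target_features, context), weights):
        m = (m - m.min()) / (m.max() - m.min() + 1e-9)     # normalise each source to [0, 1]
        combo *= m ** w                                    # weighted product of sources
    return combo

def fixation_auc(guidance, fixation_mask):
    # fixation_mask: boolean map of the same shape, True at fixated pixels.
    return roc_auc_score(fixation_mask.ravel(), guidance.ravel())
```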

    Accidental Pinhole and Pinspeck Cameras

    We identify and study two types of “accidental” images that can be formed in scenes. The first is the accidental pinhole camera image. The second class of accidental images are “inverse” pinhole camera images, formed by subtracting an image with a small occluder present from a reference image without the occluder. Both types of accidental cameras arise in a variety of situations: for example, an indoor scene illuminated by natural light, or a street where a person walks through the shadow of a building. The images produced by accidental cameras are often mistaken for shadows or interreflections. However, accidental images can reveal information about the scene outside the image, the lighting conditions, or the aperture by which light enters the scene. Funding: National Science Foundation (U.S.) (CAREER Award 0747120); United States Office of Naval Research, Multidisciplinary University Research Initiative (N000141010933); National Science Foundation (U.S.) (CGV 1111415); National Science Foundation (U.S.) (CGV 0964004).
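
    The “inverse” (pinspeck) construction described above amounts to a per-pixel subtraction of the occluded frame from the occluder-free reference, followed by rescaling for display. The sketch below shows just that step; the file names are hypothetical.

```python
# Accidental "pinspeck" image: reference frame minus frame with a small occluder (sketch).
import numpy as np
import imageio.v3 as iio

reference = iio.imread("room_no_occluder.png").astype(np.float64)    # hypothetical input
occluded = iio.imread("room_with_occluder.png").astype(np.float64)   # hypothetical input

accidental = reference - occluded              # light blocked by the occluder
accidental -= accidental.min()                 # rescale to [0, 255] for display
accidental /= accidental.max() + 1e-9
iio.imwrite("accidental_image.png", (255 * accidental).astype(np.uint8))
```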

    What have we learnt from EUPORIAS climate service prototypes?

    The international effort toward climate services, epitomised by the development of the Global Framework for Climate Services and, more recently, the launch of the Copernicus Climate Change Service, has renewed interest in the users and the role they can play in shaping the services they will eventually use. Here we critically analyse the results of the five climate service prototypes that were developed as part of the EU-funded project EUPORIAS. Starting from the experience acquired in each of the prototypes, we attempt to distil a few key lessons which, we believe, will be relevant to the wider community of climate service developers.

    ImageNet Large Scale Visual Recognition Challenge

    The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to the present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground-truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the five years of the challenge and propose future directions and improvements. Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL VOC (per-category comparisons in Table 3, distribution of localization difficulty in Fig 16), a list of queries used for obtaining object detection images (Appendix C), and some additional references.

    Highlights from the Pierre Auger Observatory

    The Pierre Auger Observatory is the world's largest cosmic ray observatory. Our current exposure reaches nearly 40,000 km$^2$ sr and provides us with a data set of unprecedented quality. The performance and stability of the detectors and their enhancements are described. Data analyses have led to a number of major breakthroughs. Among these we discuss the energy spectrum and the searches for large-scale anisotropies. We present analyses of our $X_{max}$ data and show how they can be interpreted in terms of mass composition. We also describe some new analyses that extract mass-sensitive parameters from the 100% duty cycle surface detector (SD) data. A coherent interpretation of all these recent results opens new directions. The consequences regarding the cosmic ray composition and the properties of UHECR sources are briefly discussed. Comment: 9 pages, 12 figures, talk given at the 33rd International Cosmic Ray Conference, Rio de Janeiro 2013