Skip-Thought Vectors
We describe an approach for unsupervised learning of a generic, distributed
sentence encoder. Using the continuity of text from books, we train an
encoder-decoder model that tries to reconstruct the surrounding sentences of an
encoded passage. Sentences that share semantic and syntactic properties are
thus mapped to similar vector representations. We next introduce a simple
vocabulary expansion method to encode words that were not seen as part of
training, allowing us to expand our vocabulary to a million words. After
training our model, we extract and evaluate our vectors with linear models on 8
tasks: semantic relatedness, paraphrase detection, image-sentence ranking,
question-type classification and 4 benchmark sentiment and subjectivity
datasets. The end result is an off-the-shelf encoder that can produce highly
generic sentence representations that are robust and perform well in practice.
We will make our encoder publicly available. Comment: 11 pages
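A minimal sketch of the encode-and-reconstruct-neighbours objective described above, written as a hypothetical PyTorch module (not the authors' released implementation; the module structure and layer sizes here are illustrative assumptions):

```python
# Hypothetical sketch of a skip-thought style model: a GRU encodes sentence s_i,
# and two conditional GRU decoders try to reconstruct s_{i-1} and s_{i+1}.
import torch
import torch.nn as nn

class SkipThoughtSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=620, hid_dim=2400):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.dec_prev = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.dec_next = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, cur, prev, nxt):
        # Encode the current sentence into a single vector (the sentence representation).
        _, h = self.encoder(self.embed(cur))
        # Condition both decoders on that vector and score the neighbouring sentences.
        out_prev, _ = self.dec_prev(self.embed(prev), h)
        out_next, _ = self.dec_next(self.embed(nxt), h)
        return self.out(out_prev), self.out(out_next), h.squeeze(0)

# Training minimises cross-entropy over both reconstructions; afterwards the encoder
# state h is the off-the-shelf sentence vector fed to linear models on downstream tasks.
```

The vocabulary expansion mentioned in the abstract amounts to learning a linear map from a large pre-trained word-embedding space into the encoder's input embedding space, so words unseen during training can still be encoded without retraining the model.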
What are the shapes of response time distributions in visual search?
Many visual search experiments measure response time (RT) as their primary dependent variable. Analyses typically focus on mean (or median) RT. However, given enough data, the RT distribution can be a rich source of information. For this paper, we collected about 500 trials per cell per observer for both target-present and target-absent displays in each of three classic search tasks: feature search, with the target defined by color; conjunction search, with the target defined by both color and orientation; and spatial configuration search for a 2 among distractor 5s. This large data set allows us to characterize the RT distributions in detail. We present the raw RT distributions and fit several psychologically motivated functions (ex-Gaussian, ex-Wald, Gamma, and Weibull) to the data. We analyze and interpret parameter trends from these four functions within the context of theories of visual search.
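As a concrete illustration of one of the distribution fits mentioned above, here is a hedged SciPy sketch (not the paper's analysis code) that fits an ex-Gaussian, i.e. a Gaussian convolved with an exponential, to a vector of response times by maximum likelihood:

```python
# Fit an ex-Gaussian to simulated response times using SciPy's
# exponentially modified normal distribution (illustrative only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated RTs in ms: Gaussian component (mu, sigma) plus an exponential tail (tau).
mu, sigma, tau = 450.0, 60.0, 150.0
rts = rng.normal(mu, sigma, 500) + rng.exponential(tau, 500)

# SciPy parameterises the ex-Gaussian as exponnorm(K, loc, scale) with K = tau / sigma.
K, loc, scale = stats.exponnorm.fit(rts)
print(f"mu ≈ {loc:.1f} ms, sigma ≈ {scale:.1f} ms, tau ≈ {K * scale:.1f} ms")
```

The Gaussian parameters (mu, sigma) and the exponential tail parameter (tau) are the kinds of quantities whose trends across search tasks such analyses interpret.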
A Tree-Based Context Model for Object Recognition
There has been a growing interest in exploiting contextual information in addition to local features to detect and localize multiple object categories in an image. A context model can rule out some unlikely combinations or locations of objects and guide detectors to produce a semantically coherent interpretation of a scene. However, the performance benefit of context models has been limited because most of the previous methods were tested on datasets with only a few object categories, in which most images contain one or two object categories. In this paper, we introduce a new dataset with images that contain many instances of different object categories, and propose an efficient model that captures the contextual information among more than a hundred object categories using a tree structure. Our model incorporates global image features, dependencies between object categories, and outputs of local detectors into one probabilistic framework. We demonstrate that our context model improves object recognition performance and provides a coherent interpretation of a scene, which enables a reliable image querying system by multiple object categories. In addition, our model can be applied to scene understanding tasks that local detectors alone cannot solve, such as detecting objects out of context or querying for the most typical and the least typical scenes in a dataset. This research was partially funded by Shell International Exploration and Production Inc., by the Army Research Office under award W911NF-06-1-0076, by an NSF CAREER Award (ISI 0747120), and by the Air Force Office of Scientific Research under Award No. FA9550-06-1-0324. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the Air Force.
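The full model above couples global image features, inter-category dependencies, and local detector outputs; the fragment below illustrates only the tree-over-categories ingredient, using a standard Chow-Liu construction as a stand-in (an assumption for illustration, not the authors' exact learning procedure):

```python
# Learn a maximum-mutual-information spanning tree over binary object-presence
# variables, so that co-occurrence dependencies between categories form a tree.
import numpy as np
import networkx as nx

def chow_liu_tree(presence):
    """presence: (n_images, n_categories) binary array of object occurrences."""
    n, c = presence.shape
    g = nx.Graph()
    for i in range(c):
        for j in range(i + 1, c):
            mi = 0.0
            for a in (0, 1):
                for b in (0, 1):
                    p_ab = np.mean((presence[:, i] == a) & (presence[:, j] == b)) + 1e-9
                    p_a = np.mean(presence[:, i] == a) + 1e-9
                    p_b = np.mean(presence[:, j] == b) + 1e-9
                    mi += p_ab * np.log(p_ab / (p_a * p_b))
            g.add_edge(i, j, weight=mi)
    # The maximum-weight spanning tree under mutual information is the Chow-Liu tree.
    return nx.maximum_spanning_tree(g)

rng = np.random.default_rng(0)
presence = (rng.random((1000, 10)) < 0.3).astype(int)
print(sorted(chow_liu_tree(presence).edges()))
```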
SegICP: Integrated Deep Semantic Segmentation and Pose Estimation
Recent robotic manipulation competitions have highlighted that sophisticated
robots still struggle to achieve fast and reliable perception of task-relevant
objects in complex, realistic scenarios. To improve these systems' perceptive
speed and robustness, we present SegICP, a novel integrated solution to object
recognition and pose estimation. SegICP couples convolutional neural networks
and multi-hypothesis point cloud registration to achieve both robust pixel-wise
semantic segmentation as well as accurate and real-time 6-DOF pose estimation
for relevant objects. Our architecture achieves 1 cm position error and
$<5^\circ$ angle error in real time without an initial seed. We evaluate and
benchmark SegICP against an annotated dataset generated by motion capture. Comment: IROS camera-ready
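A minimal sketch of the segment-then-register coupling described above, assuming an Open3D-style API and a single registration hypothesis (the actual SegICP system uses its own networks and multi-hypothesis registration; names here are illustrative):

```python
# Illustrative only: crop the scene cloud with a pixel-wise segmentation mask,
# then register the object's model cloud against the crop with ICP to recover
# a 6-DOF pose.
import numpy as np
import open3d as o3d

def pose_from_segmentation(scene_points, mask, model_points, init=np.eye(4)):
    """scene_points: (N, 3) points aligned with the flattened image; mask: (N,) bool."""
    scene = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(scene_points[mask]))
    model = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(model_points))
    result = o3d.pipelines.registration.registration_icp(
        model, scene, 0.01, init,  # 1 cm correspondence threshold, initial guess
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation  # 4x4 model-to-scene transform, i.e. the object pose
```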
Evaluation of APACHE II, SAPS II and SOFA as Predictors of Mortality in Patients over 80 Years Admitted to ICU
Modelling search for people in 900 scenes: A combined source model of eye guidance
How predictable are human eye movements during search in real world scenes? We recorded 14 observers’ eye movements as they performed a search task (person detection) in 912 outdoor scenes. Observers were highly consistent in the regions fixated during search, even when the target was absent from the scene. These eye movements were used to evaluate computational models of search guidance from three sources: saliency, target features, and scene context. Each of these models independently outperformed a cross-image control in predicting human fixations. Models that combined sources of guidance ultimately predicted 94% of human agreement, with the scene context component providing the most explanatory power. None of the models, however, could reach the precision and fidelity of an attentional map defined by human fixations. This work puts forth a benchmark for computational models of search in real world scenes. Further improvements in modelling should capture mechanisms underlying the selectivity of observers’ fixations during search. National Eye Institute (Integrative Training Program in Vision grant T32 EY013935); Massachusetts Institute of Technology (Singleton Graduate Research Fellowship); National Science Foundation (U.S.) (Graduate Research Fellowship); National Science Foundation (U.S.) (CAREER Award (0546262)); National Science Foundation (U.S.) (NSF contract (0705677)); National Science Foundation (U.S.) (CAREER Award (0747120))
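For intuition, a minimal sketch of the combined-source idea under stated assumptions (the map names and the multiplicative combination rule are illustrative; the paper fits and evaluates its own components):

```python
# Normalise each guidance map and combine them into one fixation-priority map.
import numpy as np

def normalise(m):
    m = m - m.min()
    return m / (m.max() + 1e-12)

def combined_guidance(saliency, target_features, scene_context, weights=(1.0, 1.0, 1.0)):
    maps = [normalise(saliency), normalise(target_features), normalise(scene_context)]
    combined = np.ones_like(maps[0])
    for w, m in zip(weights, maps):
        combined *= m ** w            # weighted multiplicative combination
    return normalise(combined)
```

Predicted fixations would then be scored against the high-priority regions of such a map (for example via ROC-style agreement with observed fixations), which is how guidance models of this kind are typically evaluated.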
Accidental Pinhole and Pinspeck Cameras
We identify and study two types of “accidental” images that can be formed in scenes. The first is an accidental pinhole camera image. The second is the “inverse” pinhole camera image, formed by subtracting an image with a small occluder present from a reference image without the occluder. Both types of accidental camera arise in a variety of situations: for example, an indoor scene illuminated by natural light, or a street with a person walking under the shadow of a building. The images produced by accidental cameras are often mistaken for shadows or interreflections. However, accidental images can reveal information about the scene outside the image, the lighting conditions, or the aperture by which light enters the scene. National Science Foundation (U.S.) (CAREER Award 0747120); United States. Office of Naval Research. Multidisciplinary University Research Initiative (N000141010933); National Science Foundation (U.S.) (CGV 1111415); National Science Foundation (U.S.) (CGV 0964004)
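The inverse (pinspeck) image described above is, in essence, a difference of two registered photographs; a minimal sketch, assuming such a registered pair is available:

```python
# Subtract the frame containing a small occluder from an occluder-free reference;
# the residual approximates the image formed by the rays the occluder blocked.
import numpy as np

def pinspeck_image(reference, with_occluder):
    """reference, with_occluder: float arrays (H, W, 3) of the same scene surface."""
    diff = reference.astype(np.float64) - with_occluder.astype(np.float64)
    diff -= diff.min()
    return diff / (diff.max() + 1e-12)   # rescale for display
```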
What have we learnt from EUPORIAS climate service prototypes?
The international effort toward climate services, epitomised by the development of the Global Framework for Climate Services and, more recently, the launch of the Copernicus Climate Change Service, has renewed interest in the users and the role they can play in shaping the services they will eventually use. Here we critically analyse the results of the five climate service prototypes that were developed as part of the EU-funded project EUPORIAS.
Starting from the experience acquired in each of the projects, we attempt to distil a few key lessons which, we believe, will be relevant to the wider community of climate service developers.
ImageNet Large Scale Visual Recognition Challenge
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in
object category classification and detection on hundreds of object categories
and millions of images. The challenge has been run annually from 2010 to
present, attracting participation from more than fifty institutions.
This paper describes the creation of this benchmark dataset and the advances
in object recognition that have been possible as a result. We discuss the
challenges of collecting large-scale ground truth annotation, highlight key
breakthroughs in categorical object recognition, provide a detailed analysis of
the current state of the field of large-scale image classification and object
detection, and compare the state-of-the-art computer vision accuracy with human
accuracy. We conclude with lessons learned in the five years of the challenge,
and propose future directions and improvements. Comment: 43 pages, 16 figures.
v3 includes additional comparisons with PASCAL VOC (per-category comparisons in
Table 3, distribution of localization difficulty in Fig. 16), a list of queries
used for obtaining object detection images (Appendix C), and some additional references.
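For reference, ILSVRC's flat classification metric is top-5 error: an image counts as correct if the ground-truth label appears among the five highest-scoring predictions. A minimal NumPy sketch of that computation (illustrative, not the official evaluation code):

```python
# Compute top-5 error from a matrix of class scores and true labels.
import numpy as np

def top5_error(scores, labels):
    """scores: (n_images, n_classes) confidences; labels: (n_images,) true class ids."""
    top5 = np.argsort(-scores, axis=1)[:, :5]          # five highest-scoring classes
    correct = (top5 == labels[:, None]).any(axis=1)    # true label among them?
    return 1.0 - correct.mean()
```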
Highlights from the Pierre Auger Observatory
The Pierre Auger Observatory is the world's largest cosmic ray observatory.
Our current exposure reaches nearly 40,000 km² sr yr and provides us with an
unprecedented quality data set. The performance and stability of the detectors
and their enhancements are described. Data analyses have led to a number of
major breakthroughs. Among these we discuss the energy spectrum and the
searches for large-scale anisotropies. We present analyses of our $X_\mathrm{max}$
data and show how it can be interpreted in terms of mass composition. We also
describe some new analyses that extract mass-sensitive parameters from the 100%
duty cycle SD data. A coherent interpretation of all these recent results opens
new directions. The consequences regarding the cosmic ray composition and the
properties of UHECR sources are briefly discussed. Comment: 9 pages, 12 figures, talk given at the 33rd International Cosmic Ray
Conference, Rio de Janeiro 2013
