Learning Temporal Transformations From Time-Lapse Videos
Based on life-long observations of physical, chemical, and biologic phenomena
in the natural world, humans can often easily picture in their minds what an
object will look like in the future. But, what about computers? In this paper,
we learn computational models of object transformations from time-lapse videos.
In particular, we explore the use of generative models to create depictions of
objects at future times. These models explore several different prediction
tasks: generating a future state given a single depiction of an object,
generating a future state given two depictions of an object at different times,
and generating future states recursively in a recurrent framework. We provide
both qualitative and quantitative evaluations of the generated results, and
also conduct a human evaluation to compare variations of our models. Comment: ECCV201
The Present and Future of Museum Accessibility for People with Visual Impairments
People with visual impairments (PVI) have shown interest in visiting museums and enjoying visual art. Based on this knowledge, some museums provide tactile reproductions of artworks, offer specialized tours for PVI, or enable them to schedule accessible visits. However, the ability of PVI to visit museums is still dependent on the assistance they get from their family and friends or from the museum personnel. In this paper, we surveyed 19 PVI to understand their opinions and expectations about visiting museums independently, as well as the requirements of user interfaces to support it. Moreover, we increase the knowledge about the previous experiences, motivations and accessibility issues of PVI in museums.
Dimensionality Reduction, Classification and Reconstruction Problems in Statistical Learning Approaches
Statistical learning theory explores ways of estimating functional dependency from a given collection of data. The specific sub-area of supervised statistical learning covers important models like Perceptron, Support Vector Machines (SVM) and Linear
Discriminant Analysis (LDA). In this paper we review the theory of such models and compare their separating hypersurfaces for extracting group-differences between samples. Classification and reconstruction are the main goals of this comparison. We show recent advances in this topic of research illustrating their application on face and medical image databases.
Survey on Vision-based Path Prediction
Path prediction is a fundamental task for estimating how pedestrians or
vehicles are going to move in a scene. Because path prediction as a task of
computer vision uses video as input, various information used for prediction,
such as the environment surrounding the target and the internal state of the
target, need to be estimated from the video in addition to predicting paths.
Many prediction approaches that include understanding the environment and the
internal state have been proposed. In this survey, we systematically summarize
methods of path prediction that take video as input and extract features
from the video. Moreover, we introduce datasets used to evaluate path
prediction methods quantitatively. Comment: DAPI 201
CAR-Net: Clairvoyant Attentive Recurrent Network
We present an interpretable framework for path prediction that leverages
dependencies between agents' behaviors and their spatial navigation
environment. We exploit two sources of information: the past motion trajectory
of the agent of interest and a wide top-view image of the navigation scene. We
propose a Clairvoyant Attentive Recurrent Network (CAR-Net) that learns where
to look in a large image of the scene when solving the path prediction task.
Our method can attend to any area, or combination of areas, within the raw
image (e.g., road intersections) when predicting the trajectory of the agent.
This allows us to visualize fine-grained semantic elements of navigation scenes
that influence the prediction of trajectories. To study the impact of space on
agents' trajectories, we build a new dataset made of top-view images of
hundreds of scenes (Formula One racing tracks) where agents' behaviors are
heavily influenced by known areas in the images (e.g., upcoming turns). CAR-Net
successfully attends to these salient regions. Additionally, CAR-Net reaches
state-of-the-art accuracy on the standard trajectory forecasting benchmark,
Stanford Drone Dataset (SDD). Finally, we show CAR-Net's ability to generalize
to unseen scenes. Comment: The 2nd and 3rd authors contributed equally