Multiple path prediction for traffic scenes using LSTMs and mixture density models
This work presents an analysis of predicting multiple future paths of moving objects in traffic scenes by leveraging Long Short-Term Memory architectures (LSTMs) and Mixture Density Networks (MDNs) in a single-shot manner. Path prediction allows estimating the future positions of objects. This is useful in important applications such as security monitoring systems, Autonomous Driver Assistance Systems and assistive technologies. Typical approaches use observed positions (tracklets) of objects in video frames to predict their future paths as a sequence of position values. This can be treated as a time series. LSTMs have achieved good performance when dealing with time series. However, LSTMs have the limitation of only predicting a single path per tracklet. Path prediction is not a deterministic task and requires predicting with a level of uncertainty. Predicting multiple paths instead of a single one is therefore a more realistic way of approaching this task. In this work, predicting a set of future paths with associated uncertainty was achieved by combining LSTMs and MDNs. The evaluation was made on the KITTI and the CityFlow datasets on three types of objects, four prediction horizons and two different points of view (image coordinates and birds-eye vie
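As a rough illustration of how a mixture density output can represent several candidate paths with uncertainty, the sketch below scores one observed 2-D position under a Gaussian mixture and ranks the mixture components by weight. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (`log_sum_exp`, `mixture_nll`, `rank_components`), the isotropic 2-D Gaussians, and the idea that an LSTM would supply `logits`, `means`, and `sigmas` are all hypothetical choices made for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_nll(logits, means, sigmas, target):
    """Negative log-likelihood of a 2-D target under an isotropic Gaussian mixture.

    logits: unnormalised mixture weights, one per component (assumed network output)
    means:  list of (x, y) component centres, one candidate position per component
    sigmas: list of positive standard deviations, one per component
    target: observed (x, y) position
    """
    log_norm = log_sum_exp(logits)
    terms = []
    for logit, (mx, my), s in zip(logits, means, sigmas):
        log_weight = logit - log_norm  # log-softmax of the mixture weight
        sq_dist = (target[0] - mx) ** 2 + (target[1] - my) ** 2
        log_gauss = -sq_dist / (2 * s * s) - math.log(2 * math.pi * s * s)
        terms.append(log_weight + log_gauss)
    return -log_sum_exp(terms)


def rank_components(logits, means):
    """Return (weight, mean) pairs sorted from most to least probable."""
    log_norm = log_sum_exp(logits)
    weights = [math.exp(v - log_norm) for v in logits]
    return sorted(zip(weights, means), key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Two hypothetical candidate positions, e.g. "continue straight" vs "turn".
    logits = [0.2, 1.5]
    means = [(10.0, 0.0), (8.0, 4.0)]
    sigmas = [1.0, 2.0]
    print(mixture_nll(logits, means, sigmas, target=(8.5, 3.0)))
    print(rank_components(logits, means))
```

In a training loop, a loss of this kind would typically be minimised over all predicted time steps, and the ranked components could be read off as alternative future positions with their associated probabilities.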
Making history: intentional capture of future memories
'Lifelogging' technology makes it possible to amass digital data about every aspect of our everyday lives. Instead of focusing on such technical possibilities, here we investigate the way people compose long-term mnemonic representations of their lives. We asked 10 families to create a time capsule, a collection of objects used to trigger remembering in the distant future. Our results show that, contrary to the lifelogging view, people are less interested in exhaustively digitally recording their past than in reconstructing it from carefully selected cues that are often physical objects. Time capsules were highly expressive and personal; many objects were made explicitly for inclusion, although with little object annotation. We use these findings to propose principles for designing technology that supports the active reconstruction of our future past
Anticipating Visual Representations from Unlabeled Video
Anticipating actions and objects before they start or appear is a difficult
problem in computer vision with several real-world applications. This task is
challenging partly because it requires leveraging extensive knowledge of the
world that is difficult to write down. We believe that a promising resource for
efficiently learning this knowledge is through readily available unlabeled
video. We present a framework that capitalizes on temporal structure in
unlabeled video to learn to anticipate human actions and objects. The key idea
behind our approach is that we can train deep networks to predict the visual
representation of images in the future. Visual representations are a promising
prediction target because they encode images at a higher semantic level than
pixels yet are automatic to compute. We then apply recognition algorithms on
our predicted representation to anticipate objects and actions. We
experimentally validate this idea on two datasets, anticipating actions one
second in the future and objects five seconds in the future. Comment: CVPR 201
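The two-stage idea described in this abstract (regress a future visual representation, then run recognition on the prediction) can be sketched in a few lines. This is only an illustrative sketch under assumptions: the squared-error objective, the nearest-prototype classifier standing in for "recognition algorithms", and the names `squared_error` and `nearest_prototype` are hypothetical and are not taken from the paper.

```python
def squared_error(predicted, actual):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def nearest_prototype(feature, prototypes):
    """Return the label whose prototype vector is closest to the given feature.

    prototypes: dict mapping a label to a representative feature vector
    (a stand-in for a recognition model applied to the predicted representation).
    """
    return min(prototypes, key=lambda label: squared_error(feature, prototypes[label]))


if __name__ == "__main__":
    # Hypothetical 2-D "representations"; real features would be much larger.
    predicted_future = [0.9, 0.2]   # assumed output of a network given the current frame
    actual_future = [1.0, 0.0]      # representation computed from the real future frame
    print(squared_error(predicted_future, actual_future))  # training signal, needs no labels

    prototypes = {"hug": [1.0, 0.0], "handshake": [0.0, 1.0]}
    print(nearest_prototype(predicted_future, prototypes))
```

The regression target here comes from the video itself, which is why unlabeled footage suffices for the first stage; labels are only needed for the recognition step.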