TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting
Existing volumetric methods for 3D human pose estimation are
accurate, but computationally expensive and optimized for single time-step
prediction. We present TEMPO, an efficient multi-view pose estimation model
that learns a robust spatiotemporal representation, improving pose accuracy
while also tracking and forecasting human pose. We significantly reduce
computation compared to the state-of-the-art by recurrently computing
per-person 2D pose features, fusing both spatial and temporal information into
a single representation. In doing so, our model is able to use spatiotemporal
context to predict more accurate human poses without sacrificing efficiency. We
further use this representation to track human poses over time as well as
predict future poses. Finally, we demonstrate that our model is able to
generalize across datasets without scene-specific fine-tuning. TEMPO achieves
10% better MPJPE with a 33× improvement in FPS compared to TesseTrack
on the challenging CMU Panoptic Studio dataset. Comment: Accepted at ICCV 202
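For readers unfamiliar with the metric quoted above, MPJPE (mean per-joint position error) is commonly computed as the average Euclidean distance between predicted and ground-truth joint positions. The following is a minimal sketch under that common definition, not the TEMPO authors' evaluation code; the function name `mpjpe`, the tuple-based joint format, and the absence of any alignment step are assumptions made for illustration.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error for one pose.

    Both arguments are sequences of (x, y, z) joint positions in the same
    units and the same joint order (an assumed input format).
    """
    if len(predicted) != len(ground_truth) or len(predicted) == 0:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # Sum the Euclidean distance between each predicted joint and its target.
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Hypothetical two-joint pose: one joint is off by 5 units, the other is exact.
example_error = mpjpe([(0, 0, 0), (1, 0, 0)], [(3, 4, 0), (1, 0, 0)])
```

Lower values are better, so a "10% better MPJPE" corresponds to a 10% lower average joint error.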
Predicting Future Instance Segmentation by Forecasting Convolutional Features
Anticipating future events is an important prerequisite towards intelligent
behavior. Video forecasting has been studied as a proxy task towards this goal.
Recent work has shown that to predict semantic segmentation of future frames,
forecasting at the semantic level is more effective than forecasting RGB frames
and then segmenting these. In this paper we consider the more challenging
problem of future instance segmentation, which additionally segments out
individual objects. To deal with a varying number of output labels per image,
we develop a predictive model in the space of fixed-sized convolutional
features of the Mask R-CNN instance segmentation model. We apply the "detection
head" of Mask R-CNN on the predicted features to produce the instance
segmentation of future frames. Experiments show that this approach
significantly improves over strong baselines based on optical flow and
repurposed instance segmentation architectures.
Dimensionality Reduction, Classification and Reconstruction Problems in Statistical Learning Approaches
Statistical learning theory explores ways of estimating functional dependency from a given collection of data. The specific sub-area of supervised statistical learning covers important models like Perceptron, Support Vector Machines (SVM) and Linear
Discriminant Analysis (LDA). In this paper we review the theory of such models and compare their separating hypersurfaces for extracting group-differences between samples. Classification and reconstruction are the main goals of this comparison. We show recent advances in this topic of research illustrating their application on face and medical image databases.
Survey on Vision-based Path Prediction
Path prediction is a fundamental task for estimating how pedestrians or
vehicles are going to move in a scene. Because path prediction as a task of
computer vision uses video as input, various information used for prediction,
such as the environment surrounding the target and the internal state of the
target, need to be estimated from the video in addition to predicting paths.
Many prediction approaches that include understanding the environment and the
internal state have been proposed. In this survey, we systematically summarize
methods of path prediction that take video as input and extract features
from the video. Moreover, we introduce datasets used to evaluate path
prediction methods quantitatively. Comment: DAPI 201
CAR-Net: Clairvoyant Attentive Recurrent Network
We present an interpretable framework for path prediction that leverages
dependencies between agents' behaviors and their spatial navigation
environment. We exploit two sources of information: the past motion trajectory
of the agent of interest and a wide top-view image of the navigation scene. We
propose a Clairvoyant Attentive Recurrent Network (CAR-Net) that learns where
to look in a large image of the scene when solving the path prediction task.
Our method can attend to any area, or combination of areas, within the raw
image (e.g., road intersections) when predicting the trajectory of the agent.
This allows us to visualize fine-grained semantic elements of navigation scenes
that influence the prediction of trajectories. To study the impact of space on
agents' trajectories, we build a new dataset made of top-view images of
hundreds of scenes (Formula One racing tracks) where agents' behaviors are
heavily influenced by known areas in the images (e.g., upcoming turns). CAR-Net
successfully attends to these salient regions. Additionally, CAR-Net reaches
state-of-the-art accuracy on the standard trajectory forecasting benchmark,
Stanford Drone Dataset (SDD). Finally, we show CAR-Net's ability to generalize
to unseen scenes. Comment: The 2nd and 3rd authors contributed equally
Charge distribution and electroluminescence in cross-linked polyethylene under dc field
The intent of this paper is to cross-correlate the information obtained by space charge distribution analysis and electroluminescence (EL) detection in cross-linked polyethylene samples submitted to dc fields, with the objective to make a link between space charge phenomena and energy release as revealed by the detection of visible photons. Space charge measurements carried out at different field levels by the pulsed electro-acoustic method show the presence of a low-field threshold, close to 15-20 kV mm⁻¹, above which considerable space charge begins to accumulate in the insulation. Charges are seen to cross the insulation thickness through a packet-like behaviour at higher fields, starting at about 60-70 kV mm⁻¹. EL measurements show the existence of two distinct thresholds, one related to the continuous excitation of EL under voltage, the other being transient EL detected upon specimen short circuit. The former occurs at values of field corresponding to charge packet formation and the latter to the onset of space charge accumulation. The correspondence between pertinent values of the electric field obtained through space charge and EL analyses provides support for the existence of degradation thresholds in insulating materials. Special emphasis is given to the relationship between charge packet formation and propagation, and EL. Although the two phenomena are observed in the same field range, it is found that the onset of continuous EL follows the formation at the electrodes of positive and negative space charge regions that extend into the bulk prior to the propagation of charge packets. Charge recombination appears to be the excitation process of EL since oppositely charged domains meet in the material bulk. To gain an insight into specific light-excitation processes associated with charge packet propagation, EL has been recorded for several hours under fields at which charge packet dynamics were evidenced.
It is shown that current and luminescence oscillations are detected during charge packet propagation, and that they are in phase. The mechanisms underlying EL and charge packets are further considered on the basis of these results.
mHealth interventions to address physical activity and sedentary behavior in cancer survivors: A systematic review
This review aimed to identify, evaluate, and synthesize the scientific literature on mobile health (mHealth) interventions to promote physical activity (PA) or reduce sedentary behavior (SB) in cancer survivors. We searched six databases from 2000 to 13 April 2020 for controlled and non-controlled trials published in any language. We conducted best evidence syntheses on controlled trials to assess the strength of the evidence. All 31 interventions included in this review measured PA outcomes, with 10 of them also evaluating SB outcomes. Most study participants were adults/older adults with various cancer types. The majority (n = 25) of studies implemented multi-component interventions, with activity trackers being the most commonly used mHealth technology. There is strong evidence for mHealth interventions, including personal contact components, in increasing moderate-to-vigorous intensity PA among cancer survivors. However, there is inconclusive evidence to support mHealth interventions in increasing total activity and step counts. There is inconclusive evidence on SB potentially due to the limited number of studies. mHealth interventions that include personal contact components are likely more effective in increasing PA than mHealth interventions without such components. Future research should address social factors in mHealth interventions for PA and SB in cancer survivors.
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
We address the challenging task of anticipating human-object interaction in
first person videos. Most existing methods ignore how the camera wearer
interacts with the objects, or simply consider body motion as a separate
modality. In contrast, we observe that the intentional hand movement reveals
critical information about the future activity. Motivated by this, we adopt
intentional hand movement as a future representation and propose a novel deep
network that jointly models and predicts the egocentric hand motion,
interaction hotspots and future action. Specifically, we consider the future
hand motion as the motor attention, and model this attention using latent
variables in our deep model. The predicted motor attention is further used to
characterise the discriminative spatial-temporal visual features for predicting
actions and interaction hotspots. We present extensive experiments
demonstrating the benefit of the proposed joint model. Importantly, our model
produces new state-of-the-art results for action anticipation on both EGTEA
Gaze+ and the EPIC-Kitchens datasets. Our project page is available at
https://aptx4869lm.github.io/ForecastingHOI