What Will I Do Next? The Intention from Motion Experiment
In computer vision, video-based approaches have been widely explored for the
early classification and prediction of actions or activities. However, it
remains unclear whether this modality (as compared to 3D kinematics) can still
be reliable for the prediction of human intentions, defined as the overarching
goal embedded in an action sequence. Since the same action can be performed
with different intentions, this problem is more challenging yet still
tractable, as shown by quantitative cognitive studies that exploit the 3D kinematics
acquired through motion capture systems. In this paper, we bridge cognitive and
computer vision studies, by demonstrating the effectiveness of video-based
approaches for the prediction of human intentions. Specifically, we propose
Intention from Motion, a new paradigm in which, without using any contextual
information, we consider instantaneous grasping motor acts involving a bottle
in order to forecast why the bottle has been reached (to pass it, to place it
in a box, or to pour or drink the liquid inside). We process only the
grasping onsets, casting intention prediction as a classification problem.
Leveraging our multimodal acquisition (3D motion capture data and 2D optical
videos), we compare the most commonly used 3D descriptors from cognitive
studies with state-of-the-art video-based techniques. Since the two analyses
achieve equivalent performance, we demonstrate that computer vision tools
are effective in capturing the kinematics and addressing the cognitive problem
of human intention prediction.
Comment: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshop
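The abstract's framing of intention prediction as classification over grasping onsets, with the same framework applied to both modalities, can be sketched as follows. The features, the nearest-centroid classifier, and all synthetic data here are illustrative assumptions, not the paper's actual descriptors or models:

```python
# A minimal sketch, assuming stand-in features: one classifier, two modalities.
import numpy as np

INTENTIONS = ["pass", "place", "pour", "drink"]

def nearest_centroid_fit(X, y, n_classes):
    """Compute one mean feature vector (centroid) per intention class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def nearest_centroid_predict(centroids, X):
    """Assign each trial to the intention whose centroid is closest."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
n_per_class, dim = 30, 16
y = np.repeat(np.arange(4), n_per_class)
# Synthetic stand-ins for per-trial features: 3D kinematic descriptors
# vs. video-based descriptors of the same grasping onsets.
offsets = rng.normal(scale=3.0, size=(4, dim))
kinematic = offsets[y] + rng.normal(size=(len(y), dim))
video = offsets[y] @ rng.normal(size=(dim, dim)) * 0.3 + rng.normal(size=(len(y), dim))

# Same classification framework applied to both modalities, mirroring the
# paper's comparison of 3D descriptors against video-based techniques.
for name, X in [("3D kinematics", kinematic), ("2D video", video)]:
    pred = nearest_centroid_predict(nearest_centroid_fit(X, y, 4), X)
    print(name, "train accuracy:", (pred == y).mean())
```

The point of the sketch is the experimental design, not the classifier: if both feature sets support comparable accuracy under an identical pipeline, the modalities are equally informative for the task.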
ADVISE: Symbolism and External Knowledge for Decoding Advertisements
In order to convey the most content in their limited space, advertisements
embed references to outside knowledge via symbolism. For example, a motorcycle
stands for adventure (a positive property the ad wants associated with the
product being sold), and a gun stands for danger (a negative property to
dissuade viewers from undesirable behaviors). We show how to use symbolic
references to better understand the meaning of an ad. We further show how
anchoring ad understanding in general-purpose object recognition and image
captioning improves results. We formulate the ad understanding task as matching
the ad image to human-generated statements that describe the action that the ad
prompts, and the rationale it provides for taking this action. Our proposed
method outperforms the state of the art on this task, and on an alternative
formulation of question-answering on ads. We show additional applications of
our learned representations for matching ads to slogans, and clustering ads
according to their topic, without extra training.
Comment: To appear, Proceedings of the European Conference on Computer Vision (ECCV
Hierarchical Multi-Task Learning Framework for Session-based Recommendations
While session-based recommender systems (SBRSs) have shown superior
recommendation performance, multi-task learning (MTL) has been adopted to
further enhance their prediction accuracy and generalizability. Hierarchical
MTL (H-MTL) sets a hierarchical structure between prediction tasks and feeds
outputs from auxiliary tasks to main tasks. This hierarchy leads to richer
input features for main tasks and higher interpretability of predictions,
compared to existing MTL frameworks. However, the H-MTL framework has not been
investigated in SBRSs yet. In this paper, we propose HierSRec which
incorporates the H-MTL architecture into SBRSs. HierSRec encodes a given
session with a metadata-aware Transformer and performs next-category prediction
(i.e., auxiliary task) with the session encoding. Next, HierSRec conducts
next-item prediction (i.e., main task) with the category prediction result and
session encoding. For scalable inference, HierSRec creates a compact set of
candidate items (e.g., 4% of total items) per test example using the category
prediction. Experiments show that HierSRec outperforms existing SBRSs in
next-item prediction accuracy on two session-based recommendation datasets. The
accuracy of HierSRec measured with the carefully curated candidate items aligns
with its accuracy calculated over all items, validating the usefulness of our
candidate generation scheme via H-MTL.
Comment: Accepted at the 6th Workshop on Online Recommender Systems and User Modeling @ ACM RecSys 202
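The hierarchical inference described above can be sketched end to end: the auxiliary task (next-category prediction) runs first, its output both enriches the main task's input and prunes the item candidate set. The encoder, weights, and catalog below are toy assumptions; the paper uses a metadata-aware Transformer, not random linear layers:

```python
# A minimal sketch of H-MTL inference, assuming toy stand-ins throughout.
import random

random.seed(0)
N_CATEGORIES, N_ITEMS, DIM = 5, 100, 8
item_category = [i % N_CATEGORIES for i in range(N_ITEMS)]

def softmax(xs):
    m = max(xs)
    exps = [2.718281828 ** (x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def linear(x, rows):
    """Apply a list of weight rows to input x, one output per row."""
    return [sum(xi * wij for xi, wij in zip(x, row)) for row in rows]

# Toy stand-in for the metadata-aware Transformer session encoding.
session_encoding = [random.gauss(0, 1) for _ in range(DIM)]

# Auxiliary task: next-category prediction from the session encoding.
W_cat = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_CATEGORIES)]
cat_probs = softmax(linear(session_encoding, W_cat))

# Candidate generation: keep only items in the top predicted category
# (the paper reports compact candidate sets, e.g., ~4% of all items).
top_cat = max(range(N_CATEGORIES), key=lambda c: cat_probs[c])
candidates = [i for i in range(N_ITEMS) if item_category[i] == top_cat]

# Main task: next-item prediction over the candidates, with the category
# probabilities fed as extra input features alongside the session encoding.
main_input = session_encoding + cat_probs
W_item = [[random.gauss(0, 1) for _ in range(len(main_input))] for _ in range(N_ITEMS)]
item_scores = {i: linear(main_input, [W_item[i]])[0] for i in candidates}
next_item = max(item_scores, key=item_scores.get)
print("category:", top_cat, "candidates:", len(candidates), "next item:", next_item)
```

The hierarchy is visible in the data flow: the main task never sees the full catalog at inference time, which is what makes the scheme scalable, and it consumes the auxiliary output as an input feature, which is what distinguishes H-MTL from flat multi-task setups.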