3,676 research outputs found
Learning Fashion Compatibility with Bidirectional LSTMs
The ubiquity of online fashion shopping demands effective recommendation
services for customers. In this paper, we study two types of fashion
recommendation: (i) suggesting an item that matches existing components in a
set to form a stylish outfit (a collection of fashion items), and (ii)
generating an outfit with multimodal (images/text) specifications from a user.
To this end, we propose to jointly learn a visual-semantic embedding and the
compatibility relationships among fashion items in an end-to-end fashion. More
specifically, we consider a fashion outfit to be a sequence (usually from top
to bottom and then accessories) and each item in the outfit as a time step.
Given the fashion items in an outfit, we train a bidirectional LSTM (Bi-LSTM)
model to sequentially predict the next item conditioned on previous ones to
learn their compatibility relationships. Further, we learn a visual-semantic
space by regressing image features to their semantic representations aiming to
inject attribute and category information as a regularization for training the
LSTM. The trained network can not only perform the aforementioned
recommendations effectively but also predict the compatibility of a given
outfit. We conduct extensive experiments on our newly collected Polyvore
dataset, and the results provide strong qualitative and quantitative evidence
that our framework outperforms alternative methods.Comment: ACM MM 1
Towards a multimedia knowledge-based agent with social competence and human interaction capabilities
We present work in progress on an intelligent embodied conversation agent in the basic care and healthcare domain. In contrast to most of the existing agents, the presented agent is aimed to have linguistic cultural, social and emotional competence needed to interact with elderly and migrants. It is composed of an ontology-based and reasoning-driven dialogue manager, multimodal communication analysis and generation modules and a search engine for the retrieval of multimedia background content from the web needed for conducting a conversation on a given topic.The presented work is funded by the European Commission under the contract number H2020-645012-RIA
- …