Creating Capsule Wardrobes from Fashion Images
We propose to automatically create capsule wardrobes. Given an inventory of
candidate garments and accessories, the algorithm must assemble a minimal set
of items that provides maximal mix-and-match outfits. We pose the task as a
subset selection problem. To permit efficient subset selection over the space
of all outfit combinations, we develop submodular objective functions capturing
the key ingredients of visual compatibility, versatility, and user-specific
preference. Since adding garments to a capsule only expands its possible
outfits, we devise an iterative approach to allow near-optimal submodular
function maximization. Finally, we present an unsupervised approach to learn
visual compatibility from "in the wild" full body outfit photos; the
compatibility metric translates well to cleaner catalog photos and improves
over existing methods. Our results on thousands of pieces from popular fashion
websites show that automatic capsule creation has potential to mimic skilled
fashionistas in assembling flexible wardrobes, while being significantly more
scalable. Comment: Accepted to CVPR 201
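The subset-selection idea in the abstract above can be illustrated with the standard greedy routine for monotone submodular maximization under a size budget, which is known to reach at least a (1 - 1/e) fraction of the optimum. This is a minimal sketch, not the authors' algorithm: the coverage objective, the function name `greedy_max_coverage`, and the toy inventory are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_coverage(
    candidates: Dict[str, Set[Hashable]], budget: int
) -> List[str]:
    """Pick up to `budget` items, each time adding the item whose
    marginal coverage gain is largest (hypothetical coverage objective)."""
    chosen: List[str] = []
    covered: Set[Hashable] = set()
    for _ in range(budget):
        best_item, best_gain = None, 0
        # Iterate in sorted order so ties are broken deterministically.
        for item in sorted(candidates):
            if item in chosen:
                continue
            gain = len(candidates[item] - covered)
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:
            break  # no remaining item adds anything new
        chosen.append(best_item)
        covered |= candidates[best_item]
    return chosen


# Hypothetical toy inventory: each garment maps to the styles it covers.
inventory = {
    "blazer": {"office", "evening"},
    "jeans": {"casual", "weekend", "office"},
    "sneakers": {"casual", "weekend"},
    "dress": {"evening", "party"},
}
capsule = greedy_max_coverage(inventory, budget=2)
print(capsule)
```

Set coverage is used here only because it is a simple monotone submodular function; the objectives described in the abstract (compatibility, versatility, user preference) would replace the `gain` computation.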
Single-Item Fashion Recommender: Towards Cross-Domain Recommendations
Nowadays, recommender systems and search engines play an integral role in
fashion e-commerce. Still, many challenges lie ahead, and this study tries to
tackle some. This article first suggests a content-based fashion recommender
system that uses a parallel neural network to take a single fashion item shop
image as input and make in-shop recommendations by listing similar items
available in the store. Next, the same structure is enhanced to personalize the
results based on user preferences. This work then introduces a background
augmentation technique that makes the system more robust to out-of-domain
queries, enabling it to make street-to-shop recommendations using only a
training set of catalog shop images. Moreover, the last contribution of this
paper is a new evaluation metric for recommendation tasks called
objective-guided human score. This method is an entirely customizable framework
that produces interpretable, comparable scores from subjective evaluations of
human scorers. Comment: 5 Pages, 6 Figures, 1 Table
The Scientist, Winter 2009
Main product detection with graph networks for fashion
Other grants: CRUE-CSIC transformative agreement. Other grants: Industrial Doctorate Grant 2016 DI 039. Computer vision has established a foothold in the online fashion retail industry. Main product detection is a crucial step of vision-based fashion product feed parsing pipelines, focused on identifying the bounding boxes that contain the product being sold in the gallery of images of the product page. The current state-of-the-art approach does not leverage the relations between regions in the image and treats images of the same product independently, so it does not fully exploit visual and product contextual information. In this paper, we propose a model that incorporates Graph Convolutional Networks (GCN) that jointly represent all detected bounding boxes in the gallery as nodes. We show that the proposed method outperforms the state of the art; in particular, when the title input is missing at inference time and in cross-dataset evaluation, it outperforms previous approaches by a large margin.
ImageNet Large Scale Visual Recognition Challenge
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in
object category classification and detection on hundreds of object categories
and millions of images. The challenge has been run annually from 2010 to
present, attracting participation from more than fifty institutions.
This paper describes the creation of this benchmark dataset and the advances
in object recognition that have been possible as a result. We discuss the
challenges of collecting large-scale ground truth annotation, highlight key
breakthroughs in categorical object recognition, provide a detailed analysis of
the current state of the field of large-scale image classification and object
detection, and compare the state-of-the-art computer vision accuracy with human
accuracy. We conclude with lessons learned in the five years of the challenge,
and propose future directions and improvements. Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL
VOC (per-category comparisons in Table 3, distribution of localization
difficulty in Fig 16), a list of queries used for obtaining object detection
images (Appendix C), and some additional references
Mobility is the Message: Experiments with Mobile Media Sharing
This thesis explores new mobile media sharing applications by building, deploying, and studying their use. While we share media in many different ways both on the web and on mobile phones, there are few ways of sharing media with people physically near us. Three systems were designed, built, and studied: Push!Music, Columbus, and Portrait Catalog, along with a fourth, commercially available system, Foursquare. This thesis offers four contributions: First, it explores the design space of co-present media sharing through four test systems. Second, through user studies of these systems, it reports on how they come to be used. Third, it explores new ways of conducting trials as the technical mobile landscape has changed. Last, it looks at how the technical solutions demonstrate different lines of thinking from how similar solutions might look today.
Through a Human-Computer Interaction methodology of design, build, and study, we look at systems through the eyes of embodied interaction and examine how the systems come to be in use. Using Goffman's understanding of social order, we see how these mobile media sharing systems allow people to actively present themselves through these media. In turn, using McLuhan's way of understanding media, we reflect on how these new systems enable a new type of medium distinct from web-centric media, and how this relates directly to mobility.
While media sharing is something that takes place everywhere in western society, it is still tied to the way media is shared through computers. Although existing sharing systems are often mobile, they do not consider the mobile setting. The systems in this thesis treat mobility as an opportunity for design. It remains to be seen how this mobile media sharing will come to present itself in people's everyday life, and when it does, how we will come to understand it and how it will transform society as a medium distinct from those before. This thesis gives a glimpse at what this future will look like.