Multi-modal embedding for main product detection in fashion
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Best Paper Award at the 2017 IEEE International Conference on Computer Vision Workshops

We present an approach to detect the main product in fashion images by exploiting the textual metadata associated with each image. Our approach is based on a Convolutional Neural Network and learns a joint embedding of object proposals and textual metadata to predict the main product in the image. We additionally use several complementary classification and overlap losses to improve training stability and performance. Our tests on a large-scale dataset taken from eight e-commerce sites show that our approach outperforms strong baselines and accurately detects the main product in a wide diversity of challenging fashion images.

Peer Reviewed. Award-winning. Postprint (author's final draft).
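The core idea of the abstract above, scoring each object proposal against the product's textual metadata in a shared embedding space, can be sketched minimally. Everything here is a stand-in: the dimensions, the random projection matrices (`W_vis`, `W_txt`), and the feature vectors are hypothetical placeholders for what the paper's CNN and text encoder would learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: N_PROP object proposals with D_VIS-d visual
# features, one D_TXT-d embedding of the product's textual metadata.
D_VIS, D_TXT, D_JOINT, N_PROP = 512, 300, 128, 4

# Stand-ins for learned projection matrices into the joint space.
W_vis = rng.normal(size=(D_VIS, D_JOINT)) / np.sqrt(D_VIS)
W_txt = rng.normal(size=(D_TXT, D_JOINT)) / np.sqrt(D_TXT)

def embed(x, W):
    """Project features into the joint space and L2-normalise them."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

proposals = rng.normal(size=(N_PROP, D_VIS))  # per-proposal CNN features
text = rng.normal(size=(D_TXT,))              # metadata embedding

z_vis = embed(proposals, W_vis)               # (N_PROP, D_JOINT)
z_txt = embed(text, W_txt)                    # (D_JOINT,)

scores = z_vis @ z_txt                        # cosine similarity per proposal
main_product = int(np.argmax(scores))         # best-matching proposal wins
```

At training time the paper additionally applies classification and overlap losses to these scores; the sketch only shows the inference-time matching step.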
A Federated Approach for Fine-Grained Classification of Fashion Apparel
As online retail services proliferate and are pervasive in modern lives,
applications for classifying fashion apparel features from image data are
becoming more indispensable. Online retailers, from leading companies to
start-ups, can leverage such applications in order to increase profit margin
and enhance the consumer experience. Many notable schemes have been proposed to
classify fashion items; however, the majority focus on basic-level categories
such as T-shirts, pants, skirts, shoes, and bags. In contrast to most prior
efforts, this paper aims to enable an in-depth classification of fashion item
attributes within the same category. Beginning with a single dress, we seek to
classify the type of dress hem, the hem length, and the sleeve length. The
proposed scheme comprises three major stages:
(a) localization of a target item from an input image using semantic
segmentation, (b) detection of human key points (e.g., point of shoulder) using
a pre-trained CNN and a bounding box, and (c) three phases to classify the
attributes using a combination of algorithmic approaches and deep neural
networks. The experimental results demonstrate that the proposed scheme is
highly effective, with all categories achieving an average precision above
93.02%, and outperforms existing Convolutional Neural Network (CNN)-based
schemes.

Comment: 11 pages, 4 figures, 5 tables, submitted to IEEE ACCESS (under review)
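Stage (c) of the scheme above combines algorithmic rules with deep networks. One plausible geometric rule, classifying hem length from the detected key points of stage (b), can be sketched as follows. The function name, the specific key points used, and the thresholds are all hypothetical illustrations, not the paper's actual rules.

```python
def classify_hem_length(hem_y, waist_y, knee_y, ankle_y):
    """Classify dress hem length from vertical key-point positions.

    Coordinates follow image convention (y grows downward). The hem
    position is expressed as a fraction of the waist-to-ankle span and
    compared against the knee's position in that same span.
    """
    span = ankle_y - waist_y
    ratio = (hem_y - waist_y) / span        # where the hem falls, 0..1
    knee_ratio = (knee_y - waist_y) / span  # where the knee falls, 0..1
    if ratio < knee_ratio - 0.1:            # clearly above the knee
        return "mini"
    if ratio <= knee_ratio + 0.1:           # around the knee
        return "knee-length"
    return "maxi"                           # well below the knee

# Example: a hem well above the knee is classified as "mini".
label = classify_hem_length(hem_y=300, waist_y=200, knee_y=350, ankle_y=450)
```

Expressing positions as fractions of the waist-to-ankle span makes the rule independent of image scale, which is why key-point detection precedes attribute classification in the pipeline.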