1,205 research outputs found

    Learning Fashion Compatibility with Bidirectional LSTMs

    The ubiquity of online fashion shopping demands effective recommendation services for customers. In this paper, we study two types of fashion recommendation: (i) suggesting an item that matches existing components in a set to form a stylish outfit (a collection of fashion items), and (ii) generating an outfit with multimodal (images/text) specifications from a user. To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion. More specifically, we consider a fashion outfit to be a sequence (usually from top to bottom and then accessories) and each item in the outfit as a time step. Given the fashion items in an outfit, we train a bidirectional LSTM (Bi-LSTM) model to sequentially predict the next item conditioned on previous ones to learn their compatibility relationships. Further, we learn a visual-semantic space by regressing image features to their semantic representations aiming to inject attribute and category information as a regularization for training the LSTM. The trained network can not only perform the aforementioned recommendations effectively but also predict the compatibility of a given outfit. We conduct extensive experiments on our newly collected Polyvore dataset, and the results provide strong qualitative and quantitative evidence that our framework outperforms alternative methods. Comment: ACM MM 1

    Query-LIFE: Query-aware Language Image Fusion Embedding for E-Commerce Relevance

    The relevance module plays a fundamental role in e-commerce search: it is responsible for selecting relevant products from thousands of items based on user queries, thereby enhancing user experience and efficiency. The traditional approach models relevance based on product titles and queries, but the information in titles alone may be insufficient to describe the products completely. A more general optimization approach is to further leverage product image information. In recent years, vision-language pre-training models, which leverage contrastive learning to map both textual and visual features into a joint embedding space, have achieved impressive results in many scenarios. In e-commerce, a common practice is to fine-tune the pre-trained model on e-commerce data. However, the performance is sub-optimal because vision-language pre-training models lack alignment specifically designed for queries. In this paper, we propose a method called Query-LIFE (Query-aware Language Image Fusion Embedding) to address these challenges. Query-LIFE utilizes a query-based multimodal fusion to effectively incorporate the image and title based on the product types. Additionally, it employs query-aware modal alignment to enhance the accuracy of the comprehensive representation of products. Furthermore, we design GenFilt, which utilizes the generation capability of large models to filter out false negative samples and further improve the overall performance of the contrastive learning task in the model. Experiments have demonstrated that Query-LIFE outperforms existing baselines. We have conducted ablation studies and human evaluations to validate the effectiveness of each module within Query-LIFE. Moreover, Query-LIFE has been deployed on Miravia Search, improving both relevance and conversion efficiency.

    Hermes International

    Hermes International is the second strongest brand in the luxury industry. It has spent more than 100 years making the Hermes name as famous as it is today. With 3,000 million euros in average revenue, Hermes has a very strong financial position and growth potential compared with its competitors. Its products are always well designed, high quality, and fashionable; they are not merely good but extremely good. Hermes’s target customers are wealthy people with high incomes who are interested in luxury items. With more than 300 stores worldwide, Hermes is working to expand its distribution network and make its products more widely available in different markets. Besides the threat of strong competitors such as Gucci and Louis Vuitton, Hermes International has to deal with replica products, which are sold in large volumes on the black market. These counterfeit products hurt Hermes not only in sales revenue but also in brand image. For now, all luxury brands are still looking for a solution to this problem. With a strong management system, Hermes has achieved considerable success. Patrick Thomas’s leadership helped Hermes expand significantly in Asian countries, and further expansion is expected in the near future. He also decided that the company should run its own crocodile farm, which reduces the cost of raw materials and the threat from suppliers. Because it uses large amounts of crocodile skin, Hermes also has an unresolved conflict with animal protection organizations. This unethical production will also harm Hermes’s image and provoke negative reactions from some consumers.

    Visual Search at eBay

    In this paper, we propose a novel end-to-end approach for scalable visual search infrastructure. We discuss the challenges we faced with a massive, volatile inventory like eBay's and present our solution to overcome them. We harness the availability of a large image collection of eBay listings and state-of-the-art deep learning techniques to perform visual search at scale. Supervised approaches for optimized search limited to the top predicted categories, and for compact binary signatures, are key to scaling up without compromising accuracy and precision. Both use a common deep neural network requiring only a single forward inference. The system architecture is presented with in-depth discussions of its basic components and optimizations for a trade-off between search relevance and latency. This solution is currently deployed in a distributed cloud infrastructure and fuels visual search in eBay ShopBot and Close5. We show benchmarks on the ImageNet dataset, on which our approach is faster and more accurate than several unsupervised baselines. We share our learnings with the hope that visual search becomes a first-class citizen for all large-scale search engines rather than an afterthought. Comment: To appear in 23rd SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017. A demonstration video can be found at https://youtu.be/iYtjs32vh4
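
The two scaling ideas named in this abstract, restricting search to the top predicted categories and comparing compact binary signatures, can be illustrated with a toy sketch. This is not the eBay system: the index layout, the 8-bit signatures, the item identifiers, and the function names `hamming_distance` and `search` are all hypothetical, and the category prediction step is assumed to have happened already.

```python
from typing import Dict, List, Set, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary signatures differ."""
    return bin(a ^ b).count("1")


def search(
    query_sig: int,
    index: Dict[str, Tuple[str, int]],
    allowed_categories: Set[str],
    k: int = 3,
) -> List[str]:
    """Return the k nearest items by Hamming distance, restricted by category.

    `index` maps a hypothetical item id to (category, binary signature).
    `allowed_categories` stands in for the top predicted categories.
    """
    candidates = [
        (hamming_distance(query_sig, sig), item_id)
        for item_id, (category, sig) in index.items()
        if category in allowed_categories
    ]
    candidates.sort()  # smallest distance first; ties broken by item id
    return [item_id for _, item_id in candidates[:k]]


# Toy index with made-up 8-bit signatures (real signatures would be much longer).
index = {
    "shoe-1": ("shoes", 0b10110010),
    "shoe-2": ("shoes", 0b10110101),
    "bag-1": ("bags", 0b10110011),
    "shoe-3": ("shoes", 0b01001101),
}

query = 0b10110011
results = search(query, index, {"shoes"}, k=2)
print(results)
```

Filtering by category before ranking shrinks the candidate set, and XOR plus a bit count keeps each comparison cheap, which is the general reason binary signatures are attractive for large inventories.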