Computational Technologies for Fashion Recommendation: A Survey
Fashion recommendation is a key research field in computational fashion
research and has attracted considerable interest in the computer vision,
multimedia, and information retrieval communities in recent years. Due to the
great demand for applications, various fashion recommendation tasks, such as
personalized fashion product recommendation, complementary (mix-and-match)
recommendation, and outfit recommendation, have been posed and explored in the
literature. The continuing research attention and advances prompt us to look
back at the field and examine it in depth for a better understanding. In this paper, we
comprehensively review recent research efforts on fashion recommendation from a
technological perspective. We first introduce fashion recommendation at a macro
level and analyse its characteristics and its differences from general
recommendation tasks. We then clearly categorize different fashion
recommendation efforts into several sub-tasks and focus on each sub-task in
terms of its problem formulation, research focus, state-of-the-art methods, and
limitations. We also summarize the datasets proposed in the literature for use
in fashion recommendation studies to give readers a concise overview.
Finally, we discuss several promising directions for future research in this
field. Overall, this survey systematically reviews the development of fashion
recommendation research. It also discusses the current limitations and gaps
between academic research and the real needs of the fashion industry. In the
process, we offer a deep insight into how the fashion industry could benefit
from fashion recommendation technologies.
Fashion Compatibility Prediction Using Ensemble Learning
Fashion is important both financially and for self-expression. There are many tasks in the fashion domain that can be addressed with artificial intelligence. The task of fashion compatibility prediction is to determine how well a set of items works together to form an outfit. Two main tasks are typically used to evaluate the performance of a fashion compatibility prediction model: Outfit Compatibility Prediction and Fill in the Blank.
In this work, a compatibility prediction model based on a graph autoencoder is evaluated. The same model is then used in a homogeneous ensemble learning approach proposed to improve compatibility prediction performance. This ensemble learning approach does not outperform the baseline. Finally, several potential approaches are introduced that may be of interest to future researchers.
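The homogeneous ensemble described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the graph-autoencoder scorer is stubbed as a random linear model, and all names and shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_base_model(dim, rng):
    """Stand-in for one trained compatibility scorer (hypothetical):
    maps a mean-pooled outfit embedding to a score in (0, 1)."""
    w = rng.normal(size=dim)
    def score(outfit_items):          # outfit_items: (n_items, dim)
        pooled = outfit_items.mean(axis=0)
        return 1.0 / (1.0 + np.exp(-pooled @ w))  # sigmoid
    return score

# Homogeneous ensemble: several copies of the same base model,
# trained independently (here: different random weights).
ensemble = [make_base_model(dim=8, rng=rng) for _ in range(5)]

def ensemble_score(outfit_items):
    """Average the members' compatibility scores."""
    return float(np.mean([m(outfit_items) for m in ensemble]))

outfit = rng.normal(size=(4, 8))      # four items, 8-d embeddings
print(round(ensemble_score(outfit), 3))
```

Averaging member scores is the simplest homogeneous-ensemble combiner; variants such as majority voting or weighted averaging follow the same pattern.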
Leveraging Multimodal Features and Item-level User Feedback for Bundle Construction
Automatic bundle construction is a crucial prerequisite step in various
bundle-aware online services. Previous approaches are mostly designed to model
the bundling strategy of existing bundles. However, it is hard to acquire a
large-scale, well-curated bundle dataset, especially for platforms that
have not offered bundle services before. Even for platforms with mature bundle
services, there are still many items that are included in few or even zero
bundles, which give rise to sparsity and cold-start challenges in the bundle
construction models. To tackle these issues, we aim to leverage multimodal
features, item-level user feedback signals, and bundle composition
information to achieve a comprehensive formulation of bundle construction.
Nevertheless, such formulation poses two new technical challenges: 1) how to
learn effective representations by optimally unifying multiple features, and 2)
how to address the modality-missing, noise, and sparsity problems
induced by the incomplete query bundles. In this work, to address these
technical challenges, we propose a Contrastive Learning-enhanced Hierarchical
Encoder method (CLHE). Specifically, we use self-attention modules to combine
the multimodal and multi-item features, and then leverage both item- and
bundle-level contrastive learning to enhance the representation learning and
thus counter the modality-missing, noise, and sparsity problems. Extensive
experiments on four datasets in two application domains demonstrate that our
method outperforms a list of SOTA methods. The code and dataset are available
at https://github.com/Xiaohao-Liu/CLHE
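The two ingredients the CLHE abstract names, self-attention fusion of per-item features and contrastive learning over bundle representations, can be sketched roughly as below. This is an illustrative numpy sketch under assumed shapes and a single attention head, not the authors' code (see their repository for that).

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16

def self_attention(x):
    """Single-head scaled dot-product self-attention over item features.
    x: (n_items, d) -> (n_items, d)"""
    scores = x @ x.T / np.sqrt(x.shape[1])
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ x

def encode_bundle(item_feats):
    """Fuse items with self-attention, mean-pool, L2-normalize."""
    h = self_attention(item_feats)
    z = h.mean(axis=0)
    return z / np.linalg.norm(z)

def info_nce(z1, z2, negatives, tau=0.2):
    """InfoNCE-style contrastive loss: pull two views of the same
    bundle together, push other bundles away."""
    pos = np.exp(z1 @ z2 / tau)
    neg = np.exp(negatives @ z1 / tau).sum()
    return float(-np.log(pos / (pos + neg)))

items = rng.normal(size=(5, d))
view1 = encode_bundle(items)           # full bundle
view2 = encode_bundle(items[:-1])      # view with one item dropped
others = np.stack([encode_bundle(rng.normal(size=(5, d)))
                   for _ in range(8)]) # other bundles as negatives
loss = info_nce(view1, view2, others)
```

Dropping an item to form the second view also mimics the incomplete-query setting the paper targets: a robust encoder should map the partial bundle near the full one.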
Learning context-aware outfit recommendation
With the rapid development and increasing popularity of online shopping for fashion products, fashion recommendation plays an important role in everyday online shopping scenarios. Fashion is not only a commodity that is bought and sold but also a visual language of signs, a nonverbal communication medium between wearers and viewers in a community. The key to fashion recommendation is to capture the semantics behind customers' fit feedback as well as fashion visual style. Existing methods have been developed around the item similarity revealed by user interactions such as ratings and purchases. By identifying user interests, marketing messages can be delivered efficiently to the right customers. Clothing style carries rich visual information such as colour and shape, shapes can be symmetric or asymmetric, and users with different backgrounds perceive clothes differently, which in turn affects the way they dress. In this paper, we propose a new method that models user preference jointly with user review information and image region-level features to make more accurate recommendations. Specifically, the proposed method learns compatibility from scene images of fashion or interior design. Extensive experiments have been conducted on several large-scale real-world datasets consisting of millions of users/items and hundreds of millions of interactions. The results indicate that the proposed method effectively improves the performance of both item prediction and outfit matching.
False Negative Distillation and Contrastive Learning for Personalized Outfit Recommendation
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, Feb. 2022. Advisor: Sang-goo Lee.
Personalized outfit recommendation has recently been in the spotlight with the rapid growth of the online fashion industry. However, recommending outfits poses two significant challenges that should be addressed. The first challenge is that outfit recommendation often requires a complex and large model that utilizes visual information, incurring huge memory and time costs. One natural way to mitigate this problem is to compress such a cumbersome model with knowledge distillation (KD) techniques that leverage knowledge from a pretrained teacher model. However, it is hard to apply existing KD approaches from recommender systems (RS) to outfit recommendation because they require a ranking over all possible outfits, while the number of outfits grows exponentially with the number of constituent clothing items. Therefore, we propose a new KD framework for outfit recommendation, called False Negative Distillation (FND), which exploits false-negative information from the teacher model without requiring the ranking of all candidates. The second challenge is that the explosive number of outfit candidates amplifies the data sparsity problem, often leading to poor outfit representations. To tackle this issue, inspired by the recent success of contrastive learning (CL), we introduce a CL framework for outfit representation learning with two proposed data augmentation methods. Quantitative and qualitative experiments on outfit recommendation datasets demonstrate the effectiveness and soundness of our proposed methods.
Abstract i
Contents ii
List of Tables v
List of Figures vi
1 Introduction 1
2 Related Work 5
2.1 Outfit Recommendation 5
2.2 Knowledge Distillation 6
2.3 Contrastive Learning 6
3 Approach 7
3.1 Background: Computing the Preference Score to an Outfit 8
3.1.1 Set Transformer 9
3.1.2 Preference score prediction 10
3.2 False Negative Distillation 10
3.2.1 Teacher model 10
3.2.2 Student model 11
3.3 Contrastive Learning for Outfits 13
3.3.1 Erase 14
3.3.2 Replace 14
3.4 Final Objective: FND-CL 14
3.5 Profiling Cold Starters 15
3.5.1 Average (avg) 16
3.5.2 Weighted Average (w-avg) 16
4 Experiment 17
4.1 Experimental Design 17
4.1.1 Datasets 17
4.1.2 Evaluation metrics 18
4.1.3 Considered methods 18
4.1.4 Implementation details 19
4.2 Performance Comparison 20
4.3 Performance on Cold Starters 21
4.4 Performance on Hard Negative Outfits 22
4.5 Performance with Different Ξ± 23
4.6 Performance with Different Augmentations 24
4.7 Performance with Different Model Sizes 25
4.8 Performance with Different Batch Sizes 27
4.9 Visualization of the User-Outfit Space 28
5 Conclusion 30
Bibliography 31
A Appendix 37
A.1 Enhancing the Performance of a Teacher Model 37
A.1.1 Teacher-CL 38
A.1.2 Employing Teacher-CL: FND-CL* 39
Abstract (In Korean) 40
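The two outfit augmentations listed in the contents above (Erase, 3.3.1; Replace, 3.3.2) can be sketched as set operations on an outfit's item embeddings. The function names and shapes below are hypothetical; in the thesis these augmentations feed a Set Transformer encoder, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(42)

def erase(outfit, rng):
    """Erase augmentation: drop one randomly chosen item."""
    idx = rng.integers(len(outfit))
    return np.delete(outfit, idx, axis=0)

def replace(outfit, item_pool, rng):
    """Replace augmentation: swap one randomly chosen item for a
    random item from a candidate pool (e.g. same-category items)."""
    out = outfit.copy()
    idx = rng.integers(len(outfit))
    out[idx] = item_pool[rng.integers(len(item_pool))]
    return out

outfit = rng.normal(size=(4, 8))   # 4 items, 8-d embeddings
pool = rng.normal(size=(100, 8))   # candidate catalogue

print(erase(outfit, rng).shape)    # (3, 8): one fewer item
print(replace(outfit, pool, rng).shape)  # (4, 8): same size
```

Two independently augmented views of the same outfit then form the positive pair for the contrastive objective, with other outfits in the batch serving as negatives.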
Dressing as a Whole: Outfit Compatibility Learning Based on Node-wise Graph Neural Networks
With the rapid development of the fashion market, customers' demands for
fashion recommendation are rising. In this paper, we aim to
investigate a practical problem of fashion recommendation by answering the
question "which item should we select to match with the given fashion items and
form a compatible outfit". The key to this problem is to estimate the outfit
compatibility. Previous works which focus on the compatibility of two items or
represent an outfit as a sequence fail to make full use of the complex
relations among items in an outfit. To remedy this, we propose to represent an
outfit as a graph. In particular, we construct a Fashion Graph, where each node
represents a category and each edge represents interaction between two
categories. Accordingly, each outfit can be represented as a subgraph by
putting items into their corresponding category nodes. To infer the outfit
compatibility from such a graph, we propose Node-wise Graph Neural Networks
(NGNN) which can better model node interactions and learn better node
representations. In NGNN, the node interaction on each edge is different, which
is determined by parameters correlated to the two connected nodes. An attention
mechanism is utilized to calculate the outfit compatibility score with learned
node representations. NGNN can not only be used to model outfit compatibility
from visual or textual modality but also from multiple modalities. We conduct
experiments on two tasks: (1) Fill-in-the-blank: suggesting an item that
matches the existing components of an outfit; (2) Compatibility prediction:
predicting the compatibility scores of given outfits. Experimental results
demonstrate the great superiority of our proposed method over others. Comment: 11 pages, accepted by the 2019 World Wide Web Conference (WWW 2019).
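The final scoring step the NGNN abstract describes, an attention mechanism over learned node (category) representations producing one compatibility score, can be sketched as below. The message passing that produces the node states is omitted, and all weights here are random stand-ins, not the paper's learned parameters.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 12
nodes = rng.normal(size=(6, d))  # 6 occupied category nodes (assumed)
w_attn = rng.normal(size=d)      # attention parameters (stand-in)
w_score = rng.normal(size=d)     # per-node scoring parameters (stand-in)

def outfit_compatibility(nodes):
    """Attention-weighted sum of per-node compatibility signals,
    squashed to (0, 1)."""
    logits = nodes @ w_attn
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()                    # softmax attention weights
    per_node = np.tanh(nodes @ w_score)   # per-node signal in (-1, 1)
    s = attn @ per_node
    return 1.0 / (1.0 + np.exp(-s))       # sigmoid -> score in (0, 1)

score = outfit_compatibility(nodes)
```

For the fill-in-the-blank task, one would score each candidate item by placing it into its category node and picking the candidate with the highest outfit score.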
Formalizing Multimedia Recommendation through Multimodal Deep Learning
Recommender systems (RSs) offer personalized navigation experiences on online
platforms, but recommendation remains a challenging task, particularly in
specific scenarios and domains. Multimodality can help tap into richer
information sources and construct more refined user/item profiles for
recommendations. However, existing literature lacks a shared and universal
schema for modeling and solving the recommendation problem through the lens of
multimodality. This work aims to formalize a general multimodal schema for
multimedia recommendation. It provides a comprehensive literature review of
multimodal approaches for multimedia recommendation from the last eight years,
outlines the theoretical foundations of a multimodal pipeline, and demonstrates
its rationale by applying it to selected state-of-the-art approaches. The work
also conducts a benchmarking analysis of recent algorithms for multimedia
recommendation within Elliot, a rigorous framework for evaluating recommender
systems. The main aim is to provide guidelines for designing and implementing
the next generation of multimodal approaches in multimedia recommendation.
- …