Efficient Large-Scale Visual Representation Learning
In this article, we present our approach to single-modality visual
representation learning. Understanding visual representations of product
content is vital for recommendations, search, and advertising applications in
e-commerce. We detail and contrast techniques used to fine-tune large-scale
visual representation learning models in an efficient manner under low-resource
settings, including several pretrained backbone architectures from both the
convolutional neural network and vision transformer families. We
highlight the challenges for e-commerce applications at scale and describe our
efforts to train, evaluate, and serve visual representations more efficiently.
We present ablation studies evaluating offline representation performance
on several downstream tasks, including our visually similar ad
recommendations. To this end, we present a novel text-to-image generative
offline evaluation method for visually similar recommendation systems. Finally,
we include online results from deployed machine learning systems in production
at Etsy.
Transformer-empowered Multi-modal Item Embedding for Enhanced Image Search in E-Commerce
Over the past decade, significant advances have been made in the field of
image search for e-commerce applications. Traditional image-to-image retrieval
models, which focus solely on image details such as texture, tend to overlook
useful semantic information contained within the images. As a result, the
retrieved products might possess similar image details, but fail to fulfil the
user's search goals. Moreover, the use of image-to-image retrieval models for
products containing multiple images results in significant online product
feature storage overhead and complex mapping implementations. In this paper, we
report the design and deployment of the proposed Multi-modal Item Embedding
Model (MIEM) to address these limitations. It is capable of utilizing both
textual information and multiple images about a product to construct meaningful
product features. By leveraging semantic information from images, MIEM
effectively supplements the image search process, improving the overall
accuracy of retrieval results. MIEM has become an integral part of the Shopee
image search platform. Since its deployment in March 2023, it has achieved a
9.90% increase in clicks per user and a 4.23% increase in orders per user
for the image search feature on the Shopee e-commerce
platform.
Comment: Accepted by IAAI 202