
    Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation

    The rapid proliferation of new users and items on the social web has aggravated the gray-sheep user/long-tail item challenge in recommender systems. Historically, cross-domain co-clustering methods have successfully leveraged shared users and items across dense and sparse domains to improve inference quality. However, they rely on shared rating data and cannot scale to multiple sparse target domains (i.e., the one-to-many transfer setting). This, combined with the increasing adoption of neural recommender architectures, motivates us to develop scalable neural layer-transfer approaches for cross-domain learning. Our key intuition is to guide neural collaborative filtering with domain-invariant components shared across the dense and sparse domains, improving the user and item representations learned in the sparse domains. We leverage contextual invariances across domains to develop these shared modules, and demonstrate that with user-item interaction context, we can learn-to-learn informative representation spaces even with sparse interaction data. We show the effectiveness and scalability of our approach on two public datasets and a massive transaction dataset from Visa, a global payments technology company (19% Item Recall, 3x faster vs. training separate models for each domain). Our approach is applicable to both implicit and explicit feedback settings.
    Comment: SIGIR 202

    Using K-means Clustering and Similarity Measure to Deal with Missing Rating in Collaborative Filtering Recommendation Systems

    Collaborative filtering recommendation systems were developed to address the information overload problem and to personalize content for the users of businesses and organizations. However, collaborative filtering suffers from data sparsity and limited online scalability, both of which lower recommendation quality. This thesis introduces a novel collaborative filtering approach that combines clustering with similarity measures. K-means clustering partitions the dataset, which reduces time complexity and improves both online scalability and data density. A similarity comparison method then predicts and fills in the missing values of the sparse dataset, further increasing data density and thereby recommendation quality. The method is evaluated on the MovieLens dataset, where, on a large sparse dataset, it achieves higher recommendation quality with lower time complexity than traditional collaborative filtering approaches.
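The two stages this abstract describes (partition users with K-means, then fill missing ratings from similar users) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the toy rating matrix, the mean-imputation used before clustering, the cosine similarity, and the helper names `user_mean`, `cosine`, `kmeans`, and `fill_missing` are all assumptions made for the example.

```python
import math
import random

def user_mean(row):
    """Mean of the ratings a user actually gave (None marks a missing rating)."""
    rated = [r for r in row if r is not None]
    return sum(rated) / len(rated) if rated else 0.0

def cosine(u, v):
    """Cosine similarity computed only over items both users rated."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    nu = math.sqrt(sum(a * a for a, _ in pairs))
    nv = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (nu * nv) if nu and nv else 0.0

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means on dense vectors; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fill_missing(ratings, k=2):
    """Fill each missing rating from similar users in the same cluster."""
    # Assumption: missing entries are replaced by the user's mean for clustering only.
    dense = [[user_mean(row) if r is None else r for r in row] for row in ratings]
    labels = kmeans(dense, k)
    filled = [list(row) for row in ratings]
    for u, row in enumerate(ratings):
        for i, r in enumerate(row):
            if r is not None:
                continue
            num = den = 0.0
            for v, other in enumerate(ratings):
                if v == u or labels[v] != labels[u] or other[i] is None:
                    continue
                s = cosine(row, other)
                if s > 0:
                    num += s * other[i]
                    den += s
            # Fall back to the user's own mean when no cluster neighbour rated the item.
            filled[u][i] = num / den if den else user_mean(row)
    return filled

# Hypothetical 6-user x 4-item rating matrix on a 1-5 scale.
ratings = [
    [5, 4, None, 1],
    [4, 5, 1, None],
    [5, 5, 1, 1],
    [1, None, 5, 4],
    [None, 1, 4, 5],
    [1, 1, 5, 5],
]
filled = fill_missing(ratings, k=2)
```

Restricting the neighbour search to one cluster is what would reduce the online cost in such a scheme: each prediction compares a user against the members of a single partition rather than against every user.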

    Next Generation of Product Search and Discovery

    Online shopping has become an important part of daily life with the rapid development of e-commerce. In some domains, such as books, electronics, and CDs/DVDs, online shopping has surpassed or even replaced traditional shopping. Compared with traditional retailing, e-commerce is information-intensive. One of the key success factors in e-business is how well it helps consumers discover products. Conventionally, a product search engine based on keyword search or a category browser is provided to help users find the product information they need. The general goal of a product search system is to enable users to quickly locate information of interest and to minimize their effort in search and navigation. Human factors play a significant role in this process. Finding product information can be a tricky task that requires intelligent use of search engines and non-trivial navigation of multilayer categories. Searching for useful product information can be frustrating for many users, especially inexperienced ones. This dissertation focuses on developing a new visual product search system that effectively extracts the properties of unstructured products and presents potentially attractive items to users, so that they can quickly locate the ones they are most likely to be interested in. We designed and developed a feature extraction algorithm that retains product color and local pattern features, and the experimental evaluation on the benchmark dataset demonstrated that it is robust against common geometric and photometric visual distortions. In addition, instead of ignoring product text information, we investigated and developed a ranking model learned via a unified probabilistic hypergraph that is capable of capturing correlations between product visual content and textual content. Moreover, we proposed and designed a fuzzy hierarchical co-clustering algorithm for collaborative filtering product recommendation. With this method, users are automatically grouped into different interest communities based on their behavior, and customized recommendations are then made according to these implicitly detected relations. In summary, the developed search system performs much better in visual unstructured product search than state-of-the-art approaches. With the comprehensive ranking scheme and the collaborative filtering recommendation module, the user's overhead in locating valuable information is reduced, and the user's experience of seeking useful product information is improved.

    Improving cold-start recommendations using item-based stereotypes

    Recommender systems (RSs) have become key components driving the success of e-commerce and other platforms where revenue and customer satisfaction depend on the user's ability to discover desirable items in large catalogues. As the number of users and items on a platform grows, computational complexity and sparsity become important challenges for any recommendation algorithm. In addition, the most widely studied filtering-based RSs, while effective in providing suggestions for established users and items, are known to perform poorly on the new user and new item (cold-start) problems. Stereotypical modelling of users and items is a promising approach to solving these problems. A stereotype represents an aggregation of the characteristics of items or users, which can be used to create general user or item classes. We propose a set of methodologies for the automatic generation of stereotypes to address the cold-start problem. The novelty of the proposed approach rests on the finding that stereotypes built independently of the user-to-item ratings improve both recommendation metrics and computational performance during cold-start phases. The resulting RS can be used with any machine learning algorithm as a solver, and the performance gains due to rate-agnostic stereotypes are orthogonal to the gains obtained using more sophisticated solvers. The paper describes how such item-based stereotypes can be evaluated via a series of statistical tests prior to being used for recommendation. The proposed approach improves recommendation quality under a variety of metrics and significantly reduces the dimension of the recommendation model.

    Music feature extraction and analysis through Python

    In the digital era, platforms like Spotify have become the primary channels of music consumption, broadening the possibilities for analyzing and understanding music through data. This project focuses on a comprehensive examination of a dataset sourced from Spotify, with Python as the tool for data extraction and analysis. The primary objective centers on the creation of this dataset, emphasizing a diverse range of songs from various subgenres. The intention is to represent both mainstream and niche musical landscapes, aligning with the Long Tail distribution concept, which highlights the market potential of less popular niche products. Through analysis, patterns in the evolution of musical features over past decades become evident. Shifts in features such as energy, loudness, danceability, and valence, and their correlation with popularity, emerge from the dataset. Parallel to this analysis, a content-based music recommendation system is designed on top of the created dataset. The aim is to connect tracks, especially lesser-known ones, with potential listeners. This project provides insights beneficial for music enthusiasts, data scientists, and industry professionals. The methodologies and analyses present a convergence of data science and the music industry in today's digital context.
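A content-based recommender over audio features, as mentioned in this abstract, could look roughly like the sketch below. The abstract does not say how the project computes similarity, so the z-score standardization, the cosine similarity, the tiny `tracks` catalogue, and the function names `standardize`, `cosine`, and `recommend` are all hypothetical choices for illustration. Only the feature names (energy, danceability, valence) come from the text.

```python
import math

# Feature names taken from the abstract; the values below are made up.
FEATURES = ("energy", "danceability", "valence")

tracks = {
    "track_a": {"energy": 0.90, "danceability": 0.80, "valence": 0.70},
    "track_b": {"energy": 0.85, "danceability": 0.75, "valence": 0.65},
    "track_c": {"energy": 0.20, "danceability": 0.30, "valence": 0.10},
    "track_d": {"energy": 0.25, "danceability": 0.20, "valence": 0.60},
}

def standardize(catalogue):
    """Z-score each feature across the catalogue so no feature dominates."""
    names = list(catalogue)
    columns = []
    for feature in FEATURES:
        values = [catalogue[name][feature] for name in names]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        columns.append([(v - mean) / std if std else 0.0 for v in values])
    return {name: [col[j] for col in columns] for j, name in enumerate(names)}

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(seed, catalogue, top_n=2):
    """Rank every other track by similarity of its features to the seed track."""
    vectors = standardize(catalogue)
    scores = {
        name: cosine(vectors[seed], vec)
        for name, vec in vectors.items()
        if name != seed
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

suggestions = recommend("track_a", tracks, top_n=1)
```

Because the ranking uses only track features and never play counts, a lesser-known track is scored the same way as a popular one, which fits the stated aim of surfacing long-tail songs.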