GraphSAIL: Graph Structure Aware Incremental Learning for Recommender Systems
Given the convenience of collecting information through online services,
recommender systems now consume large scale data and play a more important role
in improving user experience. With the recent emergence of Graph Neural
Networks (GNNs), GNN-based recommender models have shown the advantage of
modeling the recommender system as a user-item bipartite graph to learn
representations of users and items. However, such models are expensive to train
and difficult to update frequently enough to provide the most up-to-date
recommendations. In this work, we propose to update GNN-based recommender
models incrementally so that the computation time can be greatly reduced and
models can be updated more frequently. We develop a Graph Structure Aware
Incremental Learning framework, GraphSAIL, to address the commonly experienced
catastrophic forgetting problem that occurs when training a model in an
incremental fashion. Our approach preserves a user's long-term preference (or
an item's long-term property) during incremental model updating. GraphSAIL
implements a graph structure preservation strategy which explicitly preserves
each node's local structure, global structure, and self-information,
respectively. We argue that our incremental training framework is the first
attempt tailored for GNN-based recommender systems and demonstrate its
improvement compared to other incremental learning techniques on two public
datasets. We further verify the effectiveness of our framework on a large-scale
industrial dataset. Comment: Accepted by CIKM2020 Applied Research Trac
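The abstract above says GraphSAIL preserves each node's self-information during incremental updates but does not give the loss. As a rough illustration only, the sketch below assumes the simplest possible form: a mean squared drift penalty between a node's old and new embeddings, added to the loss on new interactions. The function names, the `weight` parameter, and the toy embeddings are hypothetical, and the local and global structure terms are not modeled.

```python
from typing import Dict, List

Embedding = List[float]


def self_information_penalty(
    old_emb: Dict[str, Embedding],
    new_emb: Dict[str, Embedding],
) -> float:
    """Mean squared distance between old and new embeddings of shared nodes."""
    shared = [node for node in old_emb if node in new_emb]
    if not shared:
        return 0.0
    total = 0.0
    for node in shared:
        total += sum((a - b) ** 2 for a, b in zip(old_emb[node], new_emb[node]))
    return total / len(shared)


def incremental_loss(task_loss: float, penalty: float, weight: float = 0.5) -> float:
    """Loss on the new data plus a weighted preservation penalty (assumed form)."""
    return task_loss + weight * penalty


# Toy embeddings: user_1 drifted, item_7 is unchanged, item_9 is a new node.
old = {"user_1": [1.0, 0.0], "item_7": [0.0, 1.0]}
new = {"user_1": [1.0, 1.0], "item_7": [0.0, 1.0], "item_9": [0.5, 0.5]}

penalty = self_information_penalty(old, new)
print(penalty, incremental_loss(2.0, penalty))  # 0.5 2.25
```

New nodes such as `item_9` contribute nothing to the penalty, so only nodes the old model already knew are held close to their previous representation.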
mARC: Memory by Association and Reinforcement of Contexts
This paper introduces Memory by Association and Reinforcement of Contexts
(mARC). mARC is a novel data modeling technology rooted in the second
quantization formulation of quantum mechanics. It is an all-purpose incremental
and unsupervised data storage and retrieval system which can be applied to all
types of signal or data, structured or unstructured, textual or not. mARC can
be applied to a wide range of information classification and retrieval
problems like e-Discovery or contextual navigation. It can also be formulated in
the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast
to Conway's approach, the objects evolve in a massively multidimensional space.
In order to start evaluating the potential of mARC, we have built a mARC-based
Internet search engine demonstrator with contextual functionality. We compare
the behavior of the mARC demonstrator with Google search both in terms of
performance and relevance. In the study we find that the mARC search engine
demonstrator outperforms Google search by an order of magnitude in response
time while providing more relevant results for some classes of queries.
Session-based Recommendation with User Cold-Start Problem Using Markov Chain Model & Incremental Learning
Session-based recommendation has become a hot topic in intelligent systems in recent years. As a sub-field of recommender systems, session-based recommendation studies the sequential relationships in data from users' usage sessions. In some applications, the recommender system should focus more on personalized usage features in order to make better recommendations. This thesis analyzes the statistics of users' usage sessions and proposes the Statistics Enhanced FPMC algorithm, which strengthens the personalized usage patterns of users to improve recommendation performance for in-vehicle infotainment systems and app management systems. The proposed algorithm also addresses the user cold-start problem, defined as making recommendations to new users under cold-start conditions, through incremental learning with a knowledge distillation method that alleviates catastrophic forgetting. As usage data becomes available for new users, the model can continue to be updated to improve recommendation performance. MSE, Electrical Engineering, College of Engineering & Computer Science, University of Michigan-Dearborn. http://deepblue.lib.umich.edu/bitstream/2027.42/167350/1/Zhengru Li - Final Thesis.pd
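To make the Markov chain idea in the abstract above concrete, here is a minimal sketch of a first-order, non-personalized Markov chain recommender that is updated one session at a time. It is not the thesis's Statistics Enhanced FPMC algorithm: there is no factorization, personalization, or knowledge distillation, and the popularity fallback for unseen items is an assumption. The class name and item names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


class IncrementalMarkovRecommender:
    """First-order Markov chain over items, updated session by session."""

    def __init__(self) -> None:
        # transitions[a][b] counts how often item b directly followed item a.
        self.transitions: Dict[str, Counter] = defaultdict(Counter)
        self.popularity: Counter = Counter()

    def update(self, session: List[str]) -> None:
        """Fold one new session into the counts without retraining."""
        self.popularity.update(session)
        for prev_item, next_item in zip(session, session[1:]):
            self.transitions[prev_item][next_item] += 1

    def recommend(self, last_item: str, k: int = 3) -> List[str]:
        """Most frequent successors of last_item, or popular items if unseen."""
        counts = self.transitions.get(last_item)
        if not counts:
            # Assumed fallback: no transition data yet for this item.
            return [item for item, _ in self.popularity.most_common(k)]
        return [item for item, _ in counts.most_common(k)]


model = IncrementalMarkovRecommender()
model.update(["radio", "navigation", "phone"])
model.update(["navigation", "phone"])
model.update(["radio", "navigation", "music"])

print(model.recommend("navigation", k=1))  # ['phone']
print(model.recommend("climate", k=1))     # ['navigation']
```

Each call to `update` touches only the counts for the items in that session, which is what makes the model cheap to keep current as new usage data arrives.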
A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions
Graphs represent interconnected structures prevalent in a myriad of
real-world scenarios. Effective graph analytics, such as graph learning
methods, enables users to gain profound insights from graph data, underpinning
various tasks including node classification and link prediction. However, these
methods often suffer from data imbalance, a common issue in graph data where
certain segments possess abundant data while others are scarce, thereby leading
to biased learning outcomes. This necessitates the emerging field of imbalanced
learning on graphs, which aims to correct these data distribution skews for
more accurate and representative learning outcomes. In this survey, we embark
on a comprehensive review of the literature on imbalanced learning on graphs.
We begin by providing a definitive understanding of the concept and related
terminologies, establishing a strong foundational understanding for readers.
Following this, we propose two comprehensive taxonomies: (1) the problem
taxonomy, which describes the forms of imbalance we consider, the associated
tasks, and potential solutions; (2) the technique taxonomy, which details key
strategies for addressing these imbalances, and aids readers in their method
selection process. Finally, we suggest prospective future directions for both
problems and techniques within the sphere of imbalanced learning on graphs,
fostering further innovation in this critical area. Comment: The collection of awesome literature on imbalanced learning on
graphs: https://github.com/Xtra-Computing/Awesome-Literature-ILoG
Software library for stream-based recommender systems
Traditionally, a machine learning algorithm learns from a data set that has been built and cleaned in advance. One can also analyze that data set with data mining techniques and draw conclusions from it. Both concepts have numerous applications worldwide, from medical diagnosis to speech recognition and search engine queries. However, it is traditionally assumed that the data set is available at all times. That is not necessarily true of modern data, as distributed applications receive and process millions of data streams in a limited amount of time. There is therefore a need for techniques that mine and process these data streams within a limited time period, with good results and effective scaling as the data grows. One specific use case of analyzing given data and predicting future outcomes from it is recommender systems. Several online services use recommender systems to deliver personalized content to their users. In many cases, recommendations are one of the most effective traffic generators in such services. The problem lies in finding the best small subset of items in a system that matches the personal preferences of each user, through the analysis of a very large amount of historical data. This problem gets more attention if it is treated as a generic problem rather than a specific one, that is, if a library is built so that it can be extended and used as a tool to help build a system for a specific use case.
One can distinguish between perfect solutions and statistically similar ones. Because of the large amount of data available, reprocessing all the data every time new data arrives would not be feasible, so incremental algorithms are used to process incoming data and keep the recommendation model up to date. The purpose of this work is to implement such a library, which contains and evaluates these incremental approaches to recommendation, since current ones are task-specific.
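The abstract above does not say which incremental algorithms the library contains. As one plausible example of the kind of approach it describes, the sketch below updates a small matrix factorization model with a single SGD step per incoming interaction, treating every observed (user, item) pair as a positive signal with target 1. The class name, hyperparameters, and identifiers are all hypothetical.

```python
import random
from typing import Dict, List


class IncrementalMF:
    """Matrix factorization updated with one SGD step per observed event."""

    def __init__(self, factors: int = 4, lr: float = 0.05,
                 reg: float = 0.01, seed: int = 0) -> None:
        self.factors = factors
        self.lr = lr
        self.reg = reg
        self.rng = random.Random(seed)  # seeded for reproducibility
        self.users: Dict[str, List[float]] = {}
        self.items: Dict[str, List[float]] = {}

    def _vector(self, table: Dict[str, List[float]], key: str) -> List[float]:
        # Unseen users and items get a small random vector on first access.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.factors)]
        return table[key]

    def score(self, user: str, item: str) -> float:
        p = self._vector(self.users, user)
        q = self._vector(self.items, item)
        return sum(a * b for a, b in zip(p, q))

    def update(self, user: str, item: str) -> None:
        """One gradient step toward a predicted score of 1 for this pair."""
        p = self._vector(self.users, user)
        q = self._vector(self.items, item)
        err = 1.0 - sum(a * b for a, b in zip(p, q))
        for f in range(self.factors):
            p_f, q_f = p[f], q[f]
            p[f] += self.lr * (err * q_f - self.reg * p_f)
            q[f] += self.lr * (err * p_f - self.reg * q_f)


mf = IncrementalMF()
before = mf.score("u1", "i1")
for _ in range(200):  # replay the same event to show the score rising
    mf.update("u1", "i1")
after = mf.score("u1", "i1")
print(before < after)  # True
```

Only the two vectors involved in an event are touched, so the cost of an update does not depend on how much history has already been processed.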
AutoAssign+: Automatic Shared Embedding Assignment in Streaming Recommendation
In the domain of streaming recommender systems, conventional methods for
addressing new user IDs or item IDs typically involve assigning initial ID
embeddings randomly. However, this practice results in two practical
challenges: (i) Items or users with limited interactive data may yield
suboptimal prediction performance. (ii) Embedding new IDs or low-frequency IDs
necessitates consistently expanding the embedding table, leading to unnecessary
memory consumption. In light of these concerns, we introduce a reinforcement
learning-driven framework, namely AutoAssign+, that facilitates Automatic
Shared Embedding Assignment Plus. To be specific, AutoAssign+ utilizes an
Identity Agent as an actor network, which plays a dual role: (i) Representing
low-frequency IDs field-wise with a small set of shared embeddings to enhance
the embedding initialization, and (ii) Dynamically determining which ID
features should be retained or eliminated in the embedding table. The policy of
the agent is optimized with the guidance of a critic network. To evaluate the
effectiveness of our approach, we perform extensive experiments on three
commonly used benchmark datasets. Our experiment results demonstrate that
AutoAssign+ is capable of significantly enhancing recommendation performance by
mitigating the cold-start problem. Furthermore, our framework yields a
reduction in memory usage of approximately 20-30%, verifying its practical
effectiveness and efficiency for streaming recommender systems.
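The memory argument in the abstract above can be illustrated with a much simpler stand-in. The sketch below is not AutoAssign+: in place of the reinforcement learning agent it uses a fixed frequency threshold and a CRC32 hash, both of which are assumptions. Low-frequency IDs share a small pool of embeddings and an ID gets its own row only once it has been seen `min_count` times. All names are hypothetical.

```python
import zlib
from collections import Counter
from typing import Dict, List


class SharedEmbeddingTable:
    """Rare IDs share a few embeddings; frequent IDs get a dedicated row."""

    def __init__(self, dim: int = 2, num_shared: int = 4, min_count: int = 3) -> None:
        self.min_count = min_count
        self.shared: List[List[float]] = [[0.0] * dim for _ in range(num_shared)]
        self.dedicated: Dict[str, List[float]] = {}
        self.counts: Counter = Counter()

    def _bucket(self, id_: str) -> int:
        # CRC32 is stable across runs, unlike Python's built-in str hash.
        return zlib.crc32(id_.encode("utf-8")) % len(self.shared)

    def observe(self, id_: str) -> None:
        """Count one occurrence and promote the ID once it is frequent enough."""
        self.counts[id_] += 1
        if self.counts[id_] >= self.min_count and id_ not in self.dedicated:
            # Initialize the new row from the shared embedding it used so far.
            self.dedicated[id_] = list(self.shared[self._bucket(id_)])

    def lookup(self, id_: str) -> List[float]:
        if id_ in self.dedicated:
            return self.dedicated[id_]
        return self.shared[self._bucket(id_)]

    def table_size(self) -> int:
        return len(self.shared) + len(self.dedicated)


table = SharedEmbeddingTable()
for i in range(100):
    table.observe(f"rare_{i}")  # each rare ID is seen only once
for _ in range(3):
    table.observe("popular")

print(table.table_size())  # 5
```

With 101 distinct IDs the table holds 5 rows (4 shared plus 1 dedicated) instead of 101, which is the kind of saving a shared-embedding scheme aims for. The learned policy in the paper decides retention and assignment dynamically instead of by a fixed rule.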