299 research outputs found

Deep Item-based Collaborative Filtering for Top-N Recommendation

Author: He Xiangnan
Hong Richang
Liu Kai
Wang Xiang
Xu Jiandong
Xue Feng
Publication date: 11/11/2018

Item-based collaborative filtering (ICF) has been widely adopted in industrial recommender systems, owing to its strength in user interest modeling and its ease of online personalization. By constructing a user's profile from the items that the user has consumed, ICF recommends items that are similar to the user's profile. With the prevalence of machine learning in recent years, significant progress has been made for ICF by learning item similarity (or representation) from data. Nevertheless, we argue that most existing works have only considered linear and shallow relationships between items, which are insufficient to capture the complicated decision-making process of users. In this work, we propose a more expressive ICF solution by accounting for the nonlinear and higher-order relationships among items. Going beyond modeling only the second-order interaction (e.g. similarity) between two items, we additionally consider the interaction among all interacted item pairs by using nonlinear neural networks. In this way, we can effectively model the higher-order relationships among items, capturing more complicated effects in user decision-making. For example, the model can differentiate which historical itemsets in a user's profile are more important in affecting the user's purchase decision on an item. We treat this solution as a deep variant of ICF and thus term it DeepICF. To justify our proposal, we perform empirical studies on two public datasets from MovieLens and Pinterest. Extensive experiments verify the highly positive effect of higher-order item interaction modeling with nonlinear neural networks. Moreover, we demonstrate that more fine-grained second-order interaction modeling with an attention network further improves the performance of our DeepICF method.

Comment: 25 pages, submitted to TOI
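As a rough illustration of the baseline idea described in the abstract above (recommending items similar to those in a user's profile), the following is a minimal sketch of plain item-based collaborative filtering. It is not the DeepICF model: the cosine similarity over co-occurring users, the toy `interactions` data, and the function names are all assumptions made for this example.

```python
import math

# Toy implicit-feedback data (hypothetical): user -> set of consumed items.
interactions = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"C", "D"},
}

def item_users(interactions):
    """Invert the user -> items map into item -> users."""
    inverted = {}
    for user, items in interactions.items():
        for item in items:
            inverted.setdefault(item, set()).add(user)
    return inverted

def cosine(users_i, users_j):
    """Cosine similarity between two items, each seen as a set of users."""
    if not users_i or not users_j:
        return 0.0
    return len(users_i & users_j) / math.sqrt(len(users_i) * len(users_j))

def recommend(user, interactions, top_n=1):
    """Score each unseen item by summing its similarity to the user's profile."""
    inverted = item_users(interactions)
    profile = interactions[user]
    scores = {}
    for candidate in inverted:
        if candidate in profile:
            continue
        scores[candidate] = sum(
            cosine(inverted[candidate], inverted[seen]) for seen in profile
        )
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

print(recommend("u1", interactions, top_n=2))
```

The score here is a linear sum of pairwise similarities, which is the kind of second-order, linear modeling the abstract argues is insufficient on its own.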

arXiv.org e-Print Archive

ScholarBank@NUS

Looking for Trouble: Analyzing Classifier Behavior via Pattern Divergence

Author: Baralis Elena
de Alfaro Luca
Pastor Eliana
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021

Machine learning models may perform differently on different data subgroups, which we represent as itemsets (i.e., conjunctions of simple predicates). The identification of these critical data subgroups plays an important role in many applications, for example model validation and testing, or evaluation of model fairness. Typically, the help of a domain expert is required to identify relevant (or sensitive) subgroups. We propose the notion of divergence over itemsets as a measure of different classification behavior on data subgroups, and the use of frequent pattern mining techniques for their identification. A quantification of the contribution of different attribute values to divergence, based on the mathematical foundations provided by Shapley values, allows us to identify both critical and peculiar behaviors of attributes. Extended experiments show the effectiveness of the approach in identifying critical subgroup behaviors.
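To make the notion of divergence over itemsets more concrete, here is a minimal sketch. It assumes that divergence is the difference between a classification metric on the subgroup and the same metric on the whole dataset; the abstract does not state the definition, so this choice, the false-positive-rate metric, the toy `rows` data, and all function names are assumptions for illustration only.

```python
# Toy labeled predictions (hypothetical): each row has attributes,
# a true label, and a model prediction.
rows = [
    {"sex": "F", "age": "young", "label": 0, "pred": 1},
    {"sex": "F", "age": "old", "label": 0, "pred": 1},
    {"sex": "M", "age": "young", "label": 0, "pred": 0},
    {"sex": "M", "age": "old", "label": 0, "pred": 0},
]

def false_positive_rate(rows):
    """Fraction of true negatives that the model predicted as positive."""
    negatives = [r for r in rows if r["label"] == 0]
    if not negatives:
        return 0.0
    return sum(r["pred"] == 1 for r in negatives) / len(negatives)

def matches(row, itemset):
    """An itemset is a conjunction of (attribute, value) predicates."""
    return all(row[attr] == value for attr, value in itemset)

def divergence(rows, itemset, metric=false_positive_rate):
    """Assumed definition: metric on the subgroup minus metric on all rows."""
    subgroup = [r for r in rows if matches(r, itemset)]
    return metric(subgroup) - metric(rows)

print(divergence(rows, (("sex", "F"),)))      # subgroup FPR above overall
print(divergence(rows, (("age", "young"),)))  # subgroup FPR equals overall
```

In a fuller setting, the itemsets to evaluate would come from a frequent pattern miner with a minimum support threshold rather than being listed by hand.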

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Synthetic Dataset Generation with Itemset-Based Generative Models

Author: Arias Marta
Lezcano Christian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/07/2020

arXiv.org e-Print Archive

Crossref

Synthetic dataset generation with itemset-based generative models

Author: Arias Vicente Marta
Lezcano Ríos Christian Gerardo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

This paper proposes three different data generators, tailored to transactional datasets, based on existing itemset-based generative models. All these generators are intuitive and easy to implement and show satisfactory performance. The quality of each generator is assessed by means of three different methods that capture how well the original dataset structure is preserved.

Both authors have been partially supported by TIN2017-89244-R from MINECO (Spain’s Ministerio de Economia, Industria y Competitividad) and the recognition 2017SGR-856 (MACDA) from AGAUR (Generalitat de Catalunya). Christian Lezcano is supported by Paraguay’s Foreign Postgraduate Scholarship Programme Don Carlos Antonio López (BECAL).

UPCommons. Portal del coneixement obert de la UPC

Data Mining Algorithms for Internet Data: from Transport to Application Layer

Author: Grimaudo Luigi
Publication venue: country:Italy
Publication date: 01/01/2014

Nowadays we live in a data-driven world. Advances in data generation, collection and storage technology have enabled organizations to gather data sets of massive size. Data mining is a discipline that blends traditional data analysis methods with sophisticated algorithms to handle the challenges posed by these new types of data sets. The Internet is a complex and dynamic system in which new protocols and applications arise at a constant pace. All these characteristics make the Internet a valuable and challenging data source and application domain for research, both at the Transport layer, analyzing network traffic flows, and at the Application layer, focusing on the ever-growing next generation web services: blogs, micro-blogs, on-line social networks, photo sharing services and many other applications (e.g., Twitter, Facebook, Flickr, etc.). In this thesis work we focus on the study, design and development of novel algorithms and frameworks to support large scale data mining activities over huge and heterogeneous data volumes, with a particular focus on Internet data as data source and targeting network traffic classification, on-line social network analysis, recommendation systems and cloud services and Big data.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Exploring the Evolution of Node Neighborhoods in Dynamic Networks

Author: Labatut Vincent
Naskali Ahmet Teoman
Orman Günce Keziban
Publication venue: 'Elsevier BV'
Publication date: 05/05/2017

Dynamic networks are a popular way of modeling and studying the behavior of evolving systems. However, their analysis constitutes a relatively recent subfield of Network Science, and the number of available tools is consequently much smaller than for static networks. In this work, we propose a method specifically designed to take advantage of the longitudinal nature of dynamic networks. It characterizes each individual node by studying the evolution of its direct neighborhood, based on the assumption that the way this neighborhood changes reflects the role and position of the node in the whole network. For this purpose, we define the concept of neighborhood event, which corresponds to the various transformations such groups of nodes can undergo, and describe an algorithm for detecting such events. We demonstrate the interest of our method on three real-world networks: DBLP, LastFM and Enron. We apply frequent pattern mining to extract meaningful information from temporal sequences of neighborhood events. This results in the identification of behavioral trends emerging in the whole network, as well as the individual characterization of specific nodes. We also perform a cluster analysis, which reveals that, in all three networks, one can distinguish two types of nodes exhibiting different behaviors: a very small group of active nodes, whose neighborhoods undergo diverse and frequent events, and a very large group of stable nodes.

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot