14 research outputs found

    Recommender Systems

    The ongoing rapid expansion of the Internet greatly increases the need for effective recommender systems to filter the abundant information. Extensive research on recommender systems is conducted by a broad range of communities, including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in future developments. In addition to algorithms, physical aspects are described to illustrate the macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields, which makes it of interest to physicists as well as interdisciplinary researchers.
    Comment: 97 pages, 20 figures (to appear in Physics Reports)

    Clustering-Based Personalization

    Over the last decade, recommendation systems have emerged as one of the key parts of the e-commerce ecosystem. Businesses offer a wide variety of items and content through different channels such as the Internet, smart TVs, and digital screens. For some businesses, the number of these items runs into the millions, so users can have trouble finding the products they are looking for. Recommendation systems address this problem by providing powerful methods that enable users to filter a large information and product space based on their preferences. Moreover, users have different preferences, so businesses can employ recommendation systems to reach more audiences with personalized content. Recent studies show a significant improvement in revenue and conversion rate for recommendation system adopters. Accuracy, scalability, comprehensibility, and data sparsity are the main challenges in recommendation systems. Businesses need practical and scalable recommendation models that accurately personalize millions of items for millions of users in real time. They also prefer comprehensible recommendations so that they can understand how these models target their users. However, data sparsity and the lack of sufficient data about items, users, and their interests prevent personalization models from generating accurate recommendations. In Chapter 1, we first describe basic definitions in recommendation systems. We then briefly review our contributions and their importance in this thesis. In Chapter 2, we review the major solutions in this context. Traditional recommendation methods usually build a rating matrix from the observed ratings of users on items. This rating matrix is then used with different data mining techniques to predict the unknown rating values from the known values. 
In a novel solution, in Chapter 3, we capture the mean interest of each cluster of users in each cluster of items in a cluster-level rating matrix. We first cluster users and items separately based on the known ratings. In a new matrix, we then represent the interest of each user cluster in each item cluster by averaging the ratings given by users in that user cluster to items in that item cluster. Then, we apply matrix factorization to this coarse matrix to predict future cluster-level interests. Our final rating prediction aggregates the traditional user-item rating predictions and our cluster-level rating predictions. Generating personalized recommendations for cold-start users, or users with little feedback, is a big challenge in recommendation systems. Employing any available information about these users from other domains is crucial to improving their recommendation accuracy. Thus, in Chapter 4, we extend our proposed clustering-based recommendation model by including auxiliary feedback from other domains. In a new cluster-level rating matrix, we capture the cluster-level interests between the domains to reduce the sparsity of the known ratings. By factorizing this cross-domain rating matrix, we effectively utilize data from auxiliary domains to achieve better recommendations in the target domain, especially for cold-start users. In Chapter 5, we apply our proposed clustering-based recommendation system to the Morphio platform used at a local digital marketing agency called Arcane inc. Morphio is a smart adaptive web platform designed to help Arcane produce smart content and reach more audiences. In Morphio, agencies can define multiple versions of content, including texts, images, colors, and so on, for their web pages. A personalization module then matches a version of the content to each user using their profile. 
Our ongoing real-time experiment shows a significant improvement in user conversion with our proposed clustering-based personalization. Finally, in Chapter 6, we present a summary and conclusions for this thesis. Parts of this thesis were submitted to or published in peer-reviewed journals and conferences, including ACM Transactions on Knowledge Discovery from Data and the ACM Conference on Recommender Systems.
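
    The cluster-level averaging step described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name `cluster_level_matrix`, the dictionary-based data layout, and the toy ratings are all assumptions, and the clustering and matrix factorization steps are left out (cluster assignments are taken as given).

```python
from collections import defaultdict


def cluster_level_matrix(ratings, user_cluster, item_cluster):
    """Average the known ratings per (user cluster, item cluster) pair.

    ratings: dict mapping (user, item) -> rating, known entries only.
    user_cluster / item_cluster: dicts mapping each user / item to a cluster id
    (hypothetical inputs; any clustering method could produce them).
    Returns a dict mapping (user_cluster_id, item_cluster_id) -> mean rating.
    Cluster pairs with no observed rating are simply absent.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (user, item), rating in ratings.items():
        key = (user_cluster[user], item_cluster[item])
        totals[key] += rating
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Toy data (made up for illustration): three users, three items, five ratings.
ratings = {
    ("u1", "i1"): 5, ("u1", "i2"): 4,
    ("u2", "i1"): 3,
    ("u3", "i3"): 2, ("u3", "i2"): 1,
}
user_cluster = {"u1": 0, "u2": 0, "u3": 1}
item_cluster = {"i1": 0, "i2": 0, "i3": 1}

coarse = cluster_level_matrix(ratings, user_cluster, item_cluster)
```

    The resulting coarse matrix has far fewer cells than the user-item matrix, and each cell pools several ratings, which is how the abstract's approach reduces sparsity before factorization.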

    Blockmodeling Techniques for Complex Networks.

    The class of network models known as stochastic blockmodels has recently been gaining popularity. In this dissertation, we present new work that uses blockmodels to answer questions about networks. We create a blockmodel based on the idea of link communities, which naturally gives rise to overlapping vertex communities. We derive a fast and accurate algorithm to fit the model to networks. This model can be related to another blockmodel, which allows the method to efficiently find nonoverlapping communities as well. We then create a heuristic based on the link community model whose use is to find the correct number of communities in a network. The heuristic is based on intuitive corrections to likelihood ratio tests. It does a good job finding the correct number of communities in both real networks and synthetic networks generated from the link communities model. Two commonly studied types of networks are citation networks, where research papers cite other papers, and coauthorship networks, where authors are connected if they've written a paper together. We study a multi-modal network from a large dataset of Physics publications that is the combination of the two, allowing for directed links between papers as citations, and an undirected edge between a scientist and a paper if they helped to write it. This allows for new insights on the relation between social interaction and scientific production. We also have the publication dates of papers, which lets us track our measures over time. Finally, we create a stochastic model for ranking vertices in a semi-directed network. The probability of connection between two vertices depends on the difference of their ranks. When this model is fit to high school friendship networks, the ranks appear to correspond with a measure of social status. 
Students have reciprocated and some unreciprocated edges with other students of closely similar rank that correspond to true friendship, and claim an aspirational friendship with a much higher ranked individual a fraction of the time. In general, students with more friends have higher ranks than those with fewer friends, and older students have higher ranks than younger students.
    PhD, Physics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/108855/1/briball_1.pd

    Tensor Learning for Recovering Missing Information: Algorithms and Applications on Social Media

    Real-time social systems like Facebook, Twitter, and Snapchat have been growing rapidly, producing exabytes of data in different views or aspects. Coupled with more and more GPS-enabled sharing of videos, images, blogs, and tweets that provide valuable information regarding “who”, “where”, “when” and “what”, these real-time human sensor data promise new research opportunities to uncover models of user behavior, mobility, and information sharing. These real-time dynamics in social systems usually come in multiple aspects, which can help us better understand the social interactions of the underlying network. However, these multi-aspect datasets are often raw and incomplete owing to various unpredictable or unavoidable reasons; for instance, API limitations and data sampling policies can lead to an incomplete (and often biased) perspective on these multi-aspect datasets. This missing data could raise serious concerns, such as biased estimates of the structural properties of the network and of the properties of information cascades in social networks. In order to recover missing values or information in social systems, we identify “4S” challenges: extreme sparsity of the observed multi-aspect datasets, adoption of rich side information that is able to describe the similarities of entities, generation of robust models rather than limiting them to specific applications, and scalability of models to handle real large-scale datasets (billions of observed entries). With these challenges in mind, this dissertation aims to develop scalable and interpretable tensor-based frameworks, algorithms, and methods for recovering missing information on social media. In particular, this dissertation makes four unique contributions:
    The first contribution is to propose a scalable framework based on low-rank tensor learning in the presence of incomplete information. Concretely, we formally define the problem of recovering the spatio-temporal dynamics of online memes and tackle this problem by proposing a novel tensor-based factorization approach based on the alternating direction method of multipliers (ADMM) with the integration of the latent relationships derived from contextual information among locations, memes, and times.
    The second contribution is to evaluate the generalization of the proposed tensor learning framework and extend it to the recommendation problem. In particular, we develop a novel tensor-based approach to personalized expert recommendation that integrates both the latent relationships between homogeneous entities (e.g., users and users, experts and experts) and the relationships between heterogeneous entities (e.g., users and experts, topics and experts) from the geo-spatial, topical, and social contexts.
    The third contribution is to extend the proposed tensor learning framework to the user topical profiling problem. Specifically, we propose a tensor-based contextual regularization model embedded into a matrix factorization framework, which leverages the social, textual, and behavioral contexts across users in order to overcome the identified challenges.
    The fourth contribution is to scale up the proposed tensor learning framework so that it can handle real large-scale datasets that are too big to fit in the main memory of a single machine. In particular, we propose a novel distributed tensor completion algorithm with trace-based regularization of the auxiliary information, based on ADMM under the proposed tensor learning framework, which is designed to scale up to real large-scale tensors (e.g., billions of entries) by efficiently computing auxiliary variables, minimizing intermediate data, and reducing the workload of updating new tensors.

    FCAIR 2012 Formal Concept Analysis Meets Information Retrieval Workshop co-located with the 35th European Conference on Information Retrieval (ECIR 2013) March 24, 2013, Moscow, Russia

    Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. The area came into being in the early 1980s and has since spawned over 10000 scientific publications and a variety of practically deployed tools. From a data table with objects in rows and attributes in columns, FCA allows one to build a taxonomic data structure called a concept lattice, which can be used for many purposes, especially Knowledge Discovery and Information Retrieval. The Formal Concept Analysis Meets Information Retrieval (FCAIR) workshop, co-located with the 35th European Conference on Information Retrieval (ECIR 2013), was intended, on the one hand, to attract researchers from the FCA community to a broad discussion of FCA-based research on information retrieval and, on the other hand, to promote the ideas, models, and methods of FCA in the Information Retrieval community.
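
    To make the idea of building concepts from an object-attribute table concrete, here is a minimal brute-force sketch. It is not taken from the workshop or any FCA tool: the function name `formal_concepts` and the toy table are assumptions, and the enumeration over all object subsets is only practical for very small tables.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small object-attribute table.

    context: dict mapping each object to the set of attributes it has.
    Returns a set of (extent, intent) pairs of frozensets, where the intent
    is the set of attributes shared by every object in the extent, and the
    extent is the set of all objects that have every attribute in the intent.
    """
    objects = list(context)
    all_attrs = set().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            # Attributes common to every object in the subset
            # (all attributes when the subset is empty).
            intent = set(all_attrs)
            for obj in subset:
                intent &= context[obj]
            # Close the subset: every object that has the whole intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts


# Toy table (made up for illustration): objects in rows, attributes in columns.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}
concepts = formal_concepts(context)
```

    Ordering these concepts by inclusion of their extents gives the concept lattice: here the concept with extent {duck} sits below the "flies" and "swims" concepts, which in turn sit below the concept containing all three objects.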

    User Behavior Mining in Microblogging

    Promoting Andean children's learning of science through cultural and digital tools

    Conference Theme: To see the world and a grain of sand: Learning across levels of space, time, and scale
    In Peru, there is a large achievement gap in rural schools. To address this problem, the study aims to design environments that enhance science learning through the integration of ICT with cultural artifacts, respect Andean culture, and empower rural children to pursue lifelong learning. This investigation employs the Cultural-Historical Activity Theory (CHAT) framework and the Design-Based Research (DBR) methodology, using an iterative process of design, implementation, and evaluation of the innovative practice.

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Open science is currently a prominent topic at all levels and is one of the priorities of the European Research Area. Components commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies, and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions, and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes.