25 research outputs found

    A Survey of Recommendation Systems and Performance Enhancing Methods

    Get PDF
    With the development of web services like E-commerce, job hunting websites, movie websites, recommendation system plays a more and more importance role in helping users finding their potential interests among the overloading information. There are a great number of researches available in this field, which leads to various recommendation approaches to choose from when researchers try to implement their recommendation systems. This paper gives a systematic literature review of recommendation systems where the sources are extracted from Scopus. The research problem to address, similarity metrics used, proposed method and evaluation metrics used are the focus of summary of these papers. In spite of the methodology used in traditional recommendation systems, how additional performance enhancement methods like machine learning methods, matrix factorization techniques and big data tools are applied in several papers are also introduced. Through reading this paper, researchers are able to understand what are the existing types of recommendation systems, what is the general process of recommendation systems, how the performance enhancement methods can be used to improve the system's performance. Therefore, they can choose a recommendation system which interests them for either implementation or research purpose

    User behavior modeling: Towards solving the duality of interpretability and precision

    Get PDF
    User behavior modeling has become an indispensable tool with the proliferation of socio-technical systems to provide a highly personalized experience to the users. These socio-technical systems are used in sectors as diverse as education, health, law to e-commerce, and social media. The two main challenges for user behavioral modeling are building an in-depth understanding of online user behavior and using advanced computational techniques to capture behavioral uncertainties accurately. This thesis addresses both these challenges by developing interpretable models that aid in understanding user behavior at scale and by developing sophisticated models that perform accurate modeling of user behavior. Specifically, we first propose two distinct interpretable approaches to understand explicit and latent user behavioral characteristics. Firstly, in Chapter 3, we propose an interpretable Gaussian Hidden Markov Model-based cluster model leveraging user activity data to identify users with similar patterns of behavioral evolution. We apply our approach to identify researchers with similar patterns of research interests evolution. We further show the utility of our interpretable framework to identify differences in gender distribution and the value of awarded grants among the identified archetypes. We also demonstrate generality of our approach by applying on StackExchange to identify users with a similar change in usage patterns. Next in Chapter 4, we estimate user latent behavioral characteristics by leveraging user-generated content (questions or answers) in Community Question Answering (CQA) platforms. In particular, we estimate the latent aspect-based reliability representations of users in the forum to infer the trustworthiness of their answers. We also simultaneously learn the semantic meaning of their answers through text representations. We empirically show that the estimated behavioral representations can accurately identify topical experts. We further propose to improve current behavioral models by modeling explicit and implicit user-to-user influence on user behavior. To this end, in Chapter 5, we propose a novel attention-based approach to incorporate influence from both user's social connections and other similar users on their preferences in recommender systems. Additionally, we also incorporate implicit influence in the item space by considering frequently co-occurring and similar feature items. Our modular approach captures the different influences efficiently and later fuses them in an interpretable manner. Extensive experiments show that incorporating user-to-user influence outperforms approaches relying on solely user data. User behavior remains broadly consistent across the platform. Thus, incorporating user behavioral information can be beneficial to estimate the characteristics of user-generated content. To verify it, in Chapter 6, we focus on the task of best answer selection in CQA forums that traditionally only considers textual features. We induce multiple connections between user-generated content, i.e., answers, based on the similarity and contrast in the behavior of authoring users in the platform. These induced connections enable information sharing between connected answers and, consequently, aid in estimating the quality of the answer. We also develop convolution operators to encode these semantically different graphs and later merge them using boosting. We also proposed an alternative approach to incorporate user behavioral information by jointly estimating the latent behavioral representations of user with text representations in Chapter 7. We evaluate our approach on the offensive language prediction task on Twitter. Specially, we learn an improved text representation by leveraging syntactic dependencies between the words in the tweet. We also estimate the abusive behavior of users, i.e., their likelihood of posting offensive content online from their tweets. We further show that combining the textual and user behavioral features can outperform the sophisticated textual baselines

    Novel Datasets, User Interfaces and Learner Models to Improve Learner Engagement Prediction on Educational Videos

    Get PDF
    With the emergence of Open Education Resources (OERs), educational content creation has rapidly scaled up, making a large collection of new materials made available. Among these, we find educational videos, the most popular modality for transferring knowledge in the technology-enhanced learning paradigm. Rapid creation of learning resources opens up opportunities in facilitating sustainable education, as the potential to personalise and recommend specific materials that align with individual users’ interests, goals, knowledge level, language and stylistic preferences increases. However, the quality and topical coverage of these materials could vary significantly, posing significant challenges in managing this large collection, including the risk of negative user experience and engagement with these materials. The scarcity of support resources such as public datasets is another challenge that slows down the development of tools in this research area. This thesis develops a set of novel tools that improve the recommendation of educational videos. Two novel datasets and an e-learning platform with a novel user interface are developed to support the offline and online testing of recommendation models for educational videos. Furthermore, a set of learner models that accounts for the learner interests, knowledge, novelty and popularity of content is developed through this thesis. The different models are integrated together to propose a novel learner model that accounts for the different factors simultaneously. The user studies conducted on the novel user interface show that the new interface encourages users to explore the topical content more rigorously before making relevance judgements about educational videos. Offline experiments on the newly constructed datasets show that the newly proposed learner models outperform their relevant baselines significantly

    Tabular: A Schema-driven Probabilistic Programming Language

    Get PDF
    We propose a new kind of probabilistic programming language for machine learning. We write programs simply by annotating existing relational schemas with probabilistic model expressions. We describe a detailed design of our language, Tabular, complete with formal semantics and type system. A rich series of examples illustrates the expressiveness of Tabular. We report an implementation, and show evidence of the succinctness of our notation relative to current best practice. Finally, we describe and verify a transformation of Tabular schemas so as to predict missing values in a concrete database. The ability to query for missing values provides a uniform interface to a wide variety of tasks, including classification, clustering, recommendation, and ranking

    Inferring Rankings Using Constrained Sensing

    Full text link
    We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over nn elements from given partial information; the partial information we consider is related to the group theoretic Fourier Transform of the function. This problem naturally arises in several settings such as ranked elections, multi-object tracking, ranking systems, and recommendation systems. Inspired by the work of Donoho and Stark in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size ≪\ll domain size). Our recovery method is based on finding the sparsest solution (through ℓ0\ell_0 optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for functions that can be recovered exactly from partial information through ℓ0\ell_0 optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as n→∞n \to \infty. ℓ0\ell_0 optimization is computationally hard. Therefore, the popular compressive sensing literature considers solving the convex relaxation, ℓ1\ell_1 optimization, to find the sparsest solution. However, we show that ℓ1\ell_1 optimization fails to recover a function (even with constant sparsity) generated using the random model with a high probability as n→∞n \to \infty. In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study necessary conditions for exact recovery to be possible.Comment: 19 page

    Coarse preferences: representation, elicitation, and decision making

    Get PDF
    In this thesis we present a theory for learning and inference of user preferences with a novel hierarchical representation that captures preferential indifference. Such models of ’Coarse Preferences’ represent the space of solutions with a uni-dimensional, discrete latent space of ’categories’. This results in a partitioning of the space of solutions into preferential equivalence classes. This hierarchical model significantly reduces the computational burden of learning and inference, with improvements both in computation time and convergence behaviour with respect to number of samples. We argue that this Coarse Preferences model facilitates the efficient solution of previously computationally prohibitive recommendation procedures. The new problem of ’coordination through set recommendation’ is one such procedure where we formulate an optimisation problem by leveraging the factored nature of our representation. Furthermore, we show how an on-line learning algorithm can be used for the efficient solution of this problem. Other benefits of our proposed model include increased quality of recommendations in Recommender Systems applications, in domains where users’ behaviour is consistent with such a hierarchical preference structure. We evaluate the usefulness of our proposed model and algorithms through experiments with two recommendation domains - a clothing retailer’s online interface, and a popular movie database. Our experimental results demonstrate computational gains over state of the art methods that use an additive decomposition of preferences in on-line active learning for recommendation

    Content Prompting: Modeling Content Provider Dynamics to Improve User Welfare in Recommender Ecosystems

    Full text link
    Users derive value from a recommender system (RS) only to the extent that it is able to surface content (or items) that meet their needs/preferences. While RSs often have a comprehensive view of user preferences across the entire user base, content providers, by contrast, generally have only a local view of the preferences of users that have interacted with their content. This limits a provider's ability to offer new content to best serve the broader population. In this work, we tackle this information asymmetry with content prompting policies. A content prompt is a hint or suggestion to a provider to make available novel content for which the RS predicts unmet user demand. A prompting policy is a sequence of such prompts that is responsive to the dynamics of a provider's beliefs, skills and incentives. We aim to determine a joint prompting policy that induces a set of providers to make content available that optimizes user social welfare in equilibrium, while respecting the incentives of the providers themselves. Our contributions include: (i) an abstract model of the RS ecosystem, including content provider behaviors, that supports such prompting; (ii) the design and theoretical analysis of sequential prompting policies for individual providers; (iii) a mixed integer programming formulation for optimal joint prompting using path planning in content space; and (iv) simple, proof-of-concept experiments illustrating how such policies improve ecosystem health and user welfare

    Inferring rankings using constrained sensing

    Get PDF
    We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over n elements from given partial information; the partial information we consider is related to the group theoretic Fourier Transform of the function. This problem naturally arises in several settings such as ranked elections, multi-object tracking, ranking systems, and recommendation systems. Inspired by the work of Donoho and Stark in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size <;<; domain size). Our recovery method is based on finding the sparsest solution (through l[subscript 0] optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for functions that can be recovered exactly from partial information through l[subscript 0] optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as n → ∞. ℓ0 optimization is computationally hard. Therefore, the popular compressive sensing literature considers solving the convex relaxation, ℓ[subscript 1] optimization, to find the sparsest solution. However, we show that ℓ[subscript 1] optimization fails to recover a function (even with constant sparsity) generated using the random model with a high probability as n → ∞. In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study necessary conditions for exact recovery to be possible

    Borda Regret Minimization for Generalized Linear Dueling Bandits

    Full text link
    Dueling bandits are widely used to model preferential feedback prevalent in many applications such as recommendation systems and ranking. In this paper, we study the Borda regret minimization problem for dueling bandits, which aims to identify the item with the highest Borda score while minimizing the cumulative regret. We propose a rich class of generalized linear dueling bandit models, which cover many existing models. We first prove a regret lower bound of order Ω(d2/3T2/3)\Omega(d^{2/3} T^{2/3}) for the Borda regret minimization problem, where dd is the dimension of contextual vectors and TT is the time horizon. To attain this lower bound, we propose an explore-then-commit type algorithm for the stochastic setting, which has a nearly matching regret upper bound O~(d2/3T2/3)\tilde{O}(d^{2/3} T^{2/3}). We also propose an EXP3-type algorithm for the adversarial linear setting, where the underlying model parameter can change at each round. Our algorithm achieves an O~(d2/3T2/3)\tilde{O}(d^{2/3} T^{2/3}) regret, which is also optimal. Empirical evaluations on both synthetic data and a simulated real-world environment are conducted to corroborate our theoretical analysis.Comment: 33 pages, 5 figure. This version includes new results for dueling bandits in the adversarial settin
    corecore