12 research outputs found

    Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata

    Content metadata plays a very important role in movie recommender systems, as it provides valuable information about various aspects of a movie such as genre, cast, plot synopsis, and box office summary. Analyzing the metadata can help in understanding user preferences, generating personalized recommendations, and addressing the item cold-start problem. In this talk, we focus on one particular type of metadata: genre labels. Genre labels associated with a movie or a TV series help categorize a collection of titles into different themes and set audience expectations accordingly. We present some of the challenges associated with using genre label information and propose a new way of examining genre information that we call the Genre Spectrum. The Genre Spectrum helps capture the various nuanced genres in a title, and our offline and online experiments corroborate the effectiveness of the approach. Furthermore, we discuss applications of LLMs in augmenting content metadata, which could eventually be used to organize recommendations effectively in a user's 2-D home grid.

    Recommendations and User Agency: The Reachability of Collaboratively-Filtered Information

    Recommender systems often rely on models which are trained to maximize accuracy in predicting user preferences. When the systems are deployed, these models determine the availability of content and information to different users. The gap between these objectives gives rise to a potential for unintended consequences, contributing to phenomena such as filter bubbles and polarization. In this work, we consider directly the information availability problem through the lens of user recourse. Using ideas of reachability, we propose a computationally efficient audit for top-N linear recommender models. Furthermore, we describe the relationship between model complexity and the effort necessary for users to exert control over their recommendations. We use this insight to provide a novel perspective on the user cold-start problem. Finally, we demonstrate these concepts with an empirical investigation of a state-of-the-art model trained on a widely used movie ratings dataset. Comment: appeared at FAccT '2

    Mining relationships in spatio-temporal datasets

    University of Minnesota Ph.D. dissertation. January 2013. Major: Computer science. Advisor: Vipin Kumar. 1 computer file (PDF); xi, 140 pages

    Constrained Spectral Clustering using L1 Regularization

    Constrained spectral clustering is a semi-supervised learning problem that aims at incorporating user-defined constraints in spectral clustering. Typically, there are two kinds of constraints: (i) must-link, and (ii) cannot-link. These constraints represent prior knowledge indicating whether two data objects should be in the same cluster or not, thereby aiding in clustering. In this paper, we propose a novel approach that uses convex subproblems to incorporate constraints in spectral clustering and co-clustering. In comparison to prior state-of-the-art approaches, our approach presents a more natural way to incorporate constraints in spectral methods and allows us to make a trade-off between the number of satisfied constraints and the quality of partitions on the original graph. We use an L1 regularizer analogous to LASSO, often used in the literature to induce sparsity, in order to control the number of constraints satisfied. Our approach can handle both must-link and cannot-link constraints, unlike a large number of previous approaches that mainly work on the former. Further, our formulation is based on a reduction to a convex subproblem which is relatively easy to solve using existing solvers. We test our proposed approach on real-world datasets and show its effectiveness for both spectral clustering and co-clustering over the prior state of the art.

    Connecting mutually influencing bloggers

    The blogosphere shows the characteristics of a power-law distribution, where a small set of bloggers (influentials) gets the majority of the readership and the vast majority receives little traffic. Blogger recommendation algorithms aim at finding influentials for recommendation, putting bloggers with limited readership at a further disadvantage. These bloggers could benefit from mutually endorsing each other, with the eventual goal of forming strong local communities with broader readership. In this paper, we propose a recommendation algorithm to connect blogger pairs with the intent that, once connected, the bloggers would share a mutually influencing relationship. In particular, we compute each blogger's influence profile based on how much she influences her blog friends and recommend bloggers with similar influence profiles. We characterize bloggers into four different groups: {global leaders, connectors, local leaders, isolates}. Our results show marginal benefit for isolates and significant benefit for local leaders. Our approach can be instructive in building intelligent recommendation engines that help bloggers with limited readership build strong local communities.

    Churn Prediction in MMORPGs: A Social Influence Based Approach

    Massively multiplayer online role-playing games (MMORPGs) are computer-based games in which players interact with one another in a virtual world. Worldwide revenues for MMORPGs have grown rapidly in the last few years, and by current estimates it is an industry worth more than 2 billion dollars. This large revenue potential has attracted several gaming companies to launch online role-playing games. One of the major problems these companies face, apart from fierce competition, is erosion of their customer base. Churn is a big problem for gaming companies, as churners negatively affect "word-of-mouth" reports among potential and existing customers, leading to further erosion of the user base. We study the problem of player churn in the popular MMORPG EverQuest II. The problem of churn prediction has been studied extensively in the past in various domains and social network analysis has recently been applied to the proble

    Discovering Dynamic Dipoles in Climate Data

    Pressure dipoles are important long-distance climate phenomena (teleconnections) characterized by pressure anomalies of opposite polarity appearing at two different locations at the same time. Such dipoles have proven important for understanding and explaining climate variability in many regions of the world, e.g., the El Niño climate phenomenon is known to be responsible for precipitation and temperature anomalies worldwide. This paper presents a novel approach for dipole discovery that outperforms existing state-of-the-art algorithms. Our approach is based on a climate anomaly network that is constructed using the correlation of time series of climate variables at all locations on the Earth. One novel aspect of our approach to the analysis of such networks is a careful treatment of negative correlations, whose proper consideration is critical for finding dipoles. Another key insight provided by our work is the importance of modeling the time-dependent patterns of the dipoles in order to better capture the impact of important climate phenomena on land. The results presented in this paper show that these innovations allow our approach to produce better results than previous approaches in terms of matching existing climate indices with high correlation and capturing the impact of climate indices on land.
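
    As a rough illustration of the role negative correlations play in a correlation-based anomaly network, the sketch below computes pairwise Pearson correlations between anomaly time series and reports the most negatively correlated pair of locations. It is not the algorithm from the paper: the function names, the location labels, and the toy values are all hypothetical, and a real analysis would operate on gridded climate data with far more locations and time steps.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def strongest_negative_edge(series_by_location):
    """Return (corr, loc_a, loc_b) for the most negatively correlated pair.

    Each pair of locations is treated as a candidate network edge weighted
    by the correlation of its two anomaly series.
    """
    best = None
    for a, b in combinations(sorted(series_by_location), 2):
        r = pearson(series_by_location[a], series_by_location[b])
        if best is None or r < best[0]:
            best = (r, a, b)
    return best


# Toy anomaly series (made-up values, not real climate data).
# Location "B" is the exact mirror image of location "A".
anomalies = {
    "A": [1.0, -1.0, 2.0, -2.0, 0.5],
    "B": [-1.0, 1.0, -2.0, 2.0, -0.5],
    "C": [0.2, 0.1, 0.3, -0.1, 0.0],
}

corr, loc_a, loc_b = strongest_negative_edge(anomalies)
print(f"candidate dipole: {loc_a} and {loc_b}, correlation {corr:.2f}")
```

    In this toy input the pair A and B is selected because their anomalies always have opposite sign. Keeping the sign of each correlation, rather than taking absolute values, is what lets such opposite-polarity pairs stand out from merely strongly related ones.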

    Tracking Spatio-Temporal Diffusion in Climate Data

    A forest canopy forms a critical platform for complex interactions between the vegetation and the atmospheric boundary layer, and environmental scientists consider it crucial to understanding the ecosystem and its response to climate change. Microfronts represent a class of these interactions characterized by a moving mass of air that introduces fluctuations in ambient temperature and humidity on small spatial and temporal scales. In this paper, we present a joint spatio-temporal hidden Markov model that simultaneously incorporates neighborhood dependencies in space and time. We show that our approach can trace the diffusion of microfronts more effectively than several baseline methods on sensor data from a Brazilian rainforest and on a synthetically generated dataset.