850 research outputs found

    On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization

    Revealed preference theory studies the possibility of modeling an agent's revealed preferences and the construction of a consistent utility function. However, modeling an agent's choices over preference orderings is not always practical and demands strong assumptions on human rationality and data-acquisition abilities. Therefore, we propose a simple generative choice model where agents are assumed to generate the choice probabilities based on latent factor matrices that capture their choice evaluation across multiple attributes. Since the multi-attribute evaluation is typically hidden within the agent's psyche, we consider a signaling mechanism where agents are provided with choice information through private signals, so that the agent's choices provide more insight into his/her latent evaluation across multiple attributes. We estimate the choice model via a novel multi-stage matrix factorization algorithm that minimizes the average deviation of the factor estimates from choice data. Simulation results are presented to validate the estimation performance of our proposed algorithm. Comment: 6 pages, 2 figures, to be presented at CISS conference

    Data-driven Computational Social Science: A Survey

    Social science concerns issues of individuals, relationships, and society as a whole. The complexity of its research topics makes social science an amalgamation of multiple disciplines, such as economics, political science, and sociology. For centuries, scientists have conducted many studies to understand the mechanisms of society. However, due to the limitations of traditional research methods, many critical social issues remain to be explored. To address those issues, computational social science has emerged, driven by rapid advances in computing technologies and by in-depth studies of social science. With the aid of advanced research techniques, various kinds of data from diverse areas can now be acquired, and they can help us look at social problems from a new perspective. As a result, using such data to reveal issues in the computational social science area has attracted more and more attention. In this paper, we present what is, to the best of our knowledge, the first survey on data-driven computational social science, focusing primarily on application domains involving human dynamics. The state-of-the-art research on human dynamics is reviewed from three aspects: individuals, relationships, and collectives. Specifically, the research methodologies used to address research challenges in the aforementioned application domains are summarized. In addition, some important open challenges with respect to both emerging research topics and research methods are discussed. Comment: 28 pages, 8 figures

    Tag based Bayesian latent class models for movies : economic theory reaches out to big data science

    For the past 50 years, cultural economics has developed as an independent research specialism. At its core are the creative industries and the peculiar economics associated with them, central to which is a tension that arises from the notion that creative goods need to be experienced before an assessment can be made about the utility they deliver to the consumer. In this they differ from the standard private good that forms the basis of demand theory in economic textbooks, in which utility is known ex ante. Furthermore, creative goods are typically complex in composition and subject to heterogeneous and shifting consumer preferences. In response to this, models of linear optimization, rational addiction and Bayesian learning have been applied to better understand consumer decision-making, belief formation and revision. While valuable, these approaches do not lend themselves to forming verifiable hypotheses, for the critical reason that they bypass an essential aspect of creative products: namely, that of novelty. In contrast, computer science, and more specifically recommender theory, embraces creative products as an object of study. Because creative products are traded online, their users share opinions on a massive scale and in doing so generate a flow of data-driven research. Not limited by the multiple assumptions made in economic theory, data analysts deal with this type of commodity in a less constrained way, incorporating the variety of item characteristics as well as their co-use by agents. They apply statistical techniques supporting big data, such as clustering, latent class analysis or singular value decomposition. This thesis draws on both disciplines, comparing models, methods and data sets. Based upon movie consumption, the work contrasts bottom-up versus top-down approaches, individual versus collective data, and distance measures versus utility-based comparisons.
    Rooted in Bayesian latent class models, a synthesis is formed, supported by random utility theory and recommender algorithm methods. The Bayesian approach makes explicit the experience-good nature of creative goods by formulating the prior uncertainty of users towards both movie features and preferences. The latent class method thus infers the heterogeneous aspect of preferences, while its dynamic variant, the latent Markov model, gets around one of the main paradoxes in studying creative products: how to analyse taste dynamics when confronted with a good that is novel at each decision point. Based mainly on movie-user-rating and movie-user-tag triplets, collected from the MovieLens recommender system and made available as open data for research by the GroupLens research team, this study of preference pattern formation for creative goods is drawn from individual-level data.

    Understanding Social Media Users via Attributes and Links

    With the rise of social media, hundreds of millions of people all over the globe spend countless hours on social media to connect, interact, share, and create user-generated data. This rich environment provides tremendous opportunities for many different players to easily and effectively reach out to people, interact with them, influence them, or get their opinions. Two pieces of information attract the most attention on social media sites: user preferences and interactions. Businesses and organizations use this information to better understand social media users and therefore provide them with customized services. This data can be used for different purposes such as targeted advertising, product recommendation, or even opinion mining. Social media sites use this information to better serve their users. Despite the importance of personal information, in many cases people do not reveal this information to the public. Predicting the hidden or missing information is a common response to this challenge. In this thesis, we address the problem of predicting user attributes and future or missing links using an egocentric approach. The current research proposes novel concepts and approaches to better understand social media users in two respects: (a) their attributes, preferences, and interests, and (b) their future or missing connections and interactions. More specifically, the contributions of this dissertation are (1) proposing a framework to study social media users through their attributes and link information; (2) proposing a scalable algorithm to predict user preferences; and (3) proposing a novel approach to predict attributes and links with limited information. The proposed algorithms use an egocentric approach to improve on state-of-the-art algorithms in two directions: first, by improving the prediction accuracy, and second, by increasing the scalability of the algorithms. Dissertation/Thesis: Doctoral Dissertation, Computer Science 201

    CONTEXT AWARE PRIVACY PRESERVING CLUSTERING AND CLASSIFICATION

    Data are valuable assets to any organization or individual. Data are sources of useful information, which plays a large part in decision making. All sectors have the potential to benefit from having information; commerce, health, and research are some of the fields that have benefited from data. On the other hand, the availability of data makes it easy for anyone to exploit it, and in many cases the data are private and confidential. It is therefore necessary to preserve the confidentiality of the data. We study two categories of privacy: Data Value Hiding and Data Pattern Hiding. Privacy is a huge concern, but data utility is equally important: data should avoid privacy breaches yet remain usable. Although these two objectives are contradictory and achieving both at the same time is challenging, knowing the purpose for which the data will be used, and the manner of that use, helps. In this research, we focus on some particular situations for clustering and classification problems and strive to balance the utility and privacy of the data. In the first part of this dissertation, we propose Nonnegative Matrix Factorization (NMF) based techniques that build explicitly defined constraints into the update rules. These constraints determine how the factorization takes place, leading to favorable results. The methods are designed to alter the matrices such that user-specified cluster properties are introduced, and they can be used to preserve data value as well as data pattern. As NMF and K-means are proven to be equivalent, NMF is an ideal choice for pattern hiding in clustering problems. In addition to the NMF-based methods, we propose methods that take into account the data structures and the attribute properties for classification problems. We separate the work into two parts, linear classifiers and nonlinear classifiers, and propose a different solution for each.
    We also study the effect of distortion on the utility of data. We propose three distortion measurement metrics which demonstrate better characteristics than the traditional metrics. The effectiveness of the measures is examined on different benchmark datasets. The results show that the methods have desirable properties such as invariance to translation, rotation, and scaling.
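As a rough illustration of what "update rules" means in the NMF abstract above, the following is a minimal sketch of the standard, unconstrained multiplicative updates for the Frobenius-norm objective (the Lee and Seung form). It is not the dissertation's constrained method, which the abstract does not spell out; the function name `nmf`, the rank, the iteration count, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (n x m) as W @ H with W, H >= 0.

    Plain multiplicative updates for min ||V - W H||_F^2; any privacy
    constraints would have to be added to these rules separately.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled elementwise, so nonnegativity is preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: an exactly rank-2 nonnegative matrix.
data_rng = np.random.default_rng(1)
V = data_rng.random((6, 2)) @ data_rng.random((2, 5))
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because the updates only multiply by nonnegative ratios, a constraint-aware variant would typically modify the numerators or denominators rather than project after the fact; how the dissertation does this is not stated in the abstract.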

    Collaborative-demographic hybrid for financial: product recommendation

    Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Due to the increased availability of mature data mining and analysis technologies supporting CRM processes, several financial institutions are striving to leverage customer data and integrate insights regarding customer behaviour, needs, and preferences into their marketing approach. As decision support systems assisting marketing and commercial efforts, Recommender Systems applied to the financial domain have been gaining increased attention. This thesis studies a Collaborative-Demographic Hybrid Recommendation System, applied to the financial services sector, based on real data provided by a Portuguese private commercial bank. This work establishes a framework to support account managers’ advice on which financial product is most suitable for each of the bank’s corporate clients. The recommendation problem is further developed by conducting a performance comparison of multi-output regression and multiclass classification prediction approaches. Experimental results indicate that multiclass architectures are better suited for the prediction task, outperforming alternative multi-output regression models on the evaluation metrics considered. Moreover, multiclass Feed-Forward Neural Networks, combined with Recursive Feature Elimination, are identified as the top-performing algorithm, yielding a 10-fold cross-validated F1 Measure of 83.16% and corresponding Precision and Recall values of 84.34% and 85.29%, respectively. Overall, this study provides important contributions for positioning the bank’s commercial efforts around customers’ future requirements. By allowing for a better understanding of customers’ needs and preferences, the proposed Recommender allows for more personalized and targeted marketing contacts, leading to higher conversion rates, corporate profitability, and customer satisfaction and loyalty.
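A short arithmetic note on the metrics quoted above: the harmonic mean of the reported Precision and Recall is about 84.81%, not the reported F1 of 83.16%. The abstract does not say how the figures were averaged, so the sketch below only illustrates one possible explanation, under the assumption that the metrics are averaged per class or per fold: an averaged F1 need not equal the F1 of averaged precision and recall. The per-class values are hypothetical.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# F1 implied by the single pair of figures quoted in the abstract.
implied = f1(84.34, 85.29)
print(f"harmonic mean of reported P and R: {implied:.2f}")

# Hypothetical two-class example showing why averaging matters.
per_class = [(1.0, 0.5), (0.5, 1.0)]          # (precision, recall) per class
macro_f1 = sum(f1(p, r) for p, r in per_class) / len(per_class)
macro_p = sum(p for p, _ in per_class) / len(per_class)
macro_r = sum(r for _, r in per_class) / len(per_class)
f1_of_means = f1(macro_p, macro_r)
print(f"mean of per-class F1: {macro_f1:.4f}")
print(f"F1 of mean P and mean R: {f1_of_means:.4f}")
```

In the toy case the mean of the per-class F1 scores is lower than the F1 computed from the mean precision and recall, which is the same direction as the gap in the reported numbers.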

    Probabilistic Personalized Recommendation Models For Heterogeneous Social Data

    Content recommendation has risen to a new dimension with the advent of platforms like Twitter, Facebook, FriendFeed, Dailybooth, and Instagram. Although this flood of data has provided us with a goldmine of real-world information, the problem of information overload has become a major barrier in developing predictive models. Therefore, the objective of this Thesis is to propose various recommendation, prediction and information retrieval models that are capable of leveraging such vast heterogeneous content. More specifically, this Thesis focuses on proposing models based on probabilistic generative frameworks for the following tasks: (a) recommending backers and projects in the Kickstarter crowdfunding domain and (b) point-of-interest recommendation in Foursquare. Through a comprehensive set of experiments over a variety of datasets, we show that our models are capable of providing practically useful results for recommendation and information retrieval tasks.

    A PRACTICAL THEORY OF FUNGIBILITY

    We formalize 'degrees of fungibility' by differentiating goods according to both their underlying attributes and the perceived value and/or usefulness of those attributes to a value assessor. This allows us to distinguish goods that appear to be 'exactly the same' from those that appear to be 'nearly the same'. Such a distinction is of particular importance in the design space of digital goods, which may exist both natively in the digital space and as surrogates, i.e. as digital representations of physical goods. We provide motivating examples where digital objects are too fungible for certain desired uses, and proceed to develop a formal framework under which degrees of fungibility can be defined and characterized. We close by bridging this framework to applications in machine learning and market design. Series: Working Paper Series / Institute for Cryptoeconomics / Interdisciplinary Research

    Enhancing explainability and scrutability of recommender systems

    Our increasing reliance on complex algorithms for recommendations calls for models and methods for explainable, scrutable, and trustworthy AI. While explainability is required for understanding the relationships between model inputs and outputs, a scrutable system allows us to modify its behavior as desired. These properties help bridge the gap between our expectations and the algorithm’s behavior and accordingly boost our trust in AI. Aiming to cope with information overload, recommender systems play a crucial role in filtering content (such as products, news, songs, and movies) and shaping a personalized experience for their users. Consequently, there has been a growing demand from information consumers to receive proper explanations for their personalized recommendations. These explanations aim at helping users understand why certain items are recommended to them and how their previous inputs to the system relate to the generation of such recommendations. Moreover, when users receive undesirable content, explanations may contain valuable information on how the system’s behavior can be modified accordingly. In this thesis, we present our contributions towards explainability and scrutability of recommender systems:
    • We introduce a user-centric framework, FAIRY, for discovering and ranking post-hoc explanations for the social feeds generated by black-box platforms. These explanations reveal relationships between users’ profiles and their feed items and are extracted from the local interaction graphs of users. FAIRY employs a learning-to-rank (LTR) method to score candidate explanations based on their relevance and surprisal.
    • We propose a method, PRINCE, to facilitate provider-side explainability in graph-based recommender systems that use personalized PageRank at their core. PRINCE explanations are comprehensible for users, because they present subsets of the user’s prior actions responsible for the received recommendations. PRINCE operates in a counterfactual setup and builds on a polynomial-time algorithm for finding the smallest counterfactual explanations.
    • We propose a human-in-the-loop framework, ELIXIR, for enhancing scrutability and subsequently the recommendation models by leveraging user feedback on explanations. ELIXIR enables recommender systems to collect user feedback on pairs of recommendations and explanations. The feedback is incorporated into the model by imposing a soft constraint for learning user-specific item representations.
    We evaluate all proposed models and methods with real user studies and demonstrate their benefits at achieving explainability and scrutability in recommender systems.
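For readers unfamiliar with the personalized PageRank scores that the PRINCE item above builds on, here is a minimal power-iteration sketch. It is not the PRINCE algorithm and does not compute counterfactual explanations; the teleport probability `alpha`, the function name, and the four-node toy graph are assumptions made for illustration.

```python
import numpy as np

def personalized_pagerank(adj, source, alpha=0.15, n_iter=200):
    """Personalized PageRank by power iteration.

    adj    : square adjacency matrix; every node is assumed to have
             at least one outgoing edge (no dangling nodes).
    source : index of the node the random walk restarts from.
    alpha  : restart (teleport) probability.
    """
    adj = np.asarray(adj, dtype=float)
    P = adj / adj.sum(axis=1, keepdims=True)   # row-stochastic transitions
    e = np.zeros(adj.shape[0])
    e[source] = 1.0
    pi = e.copy()
    for _ in range(n_iter):
        # Restart at the source with probability alpha, else follow an edge.
        pi = alpha * e + (1 - alpha) * (pi @ P)
    return pi

# Hypothetical toy graph: node 0 plays the role of the user.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
pi = personalized_pagerank(adj, source=0)
print(pi)
```

In a counterfactual setting of the kind the abstract describes, one would ask how these scores change when some of the source node's outgoing edges (the user's prior actions) are removed; that search is the part the thesis contributes and is not attempted here.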