Choice of Metrics used in Collaborative Filtering and their Impact on Recommender Systems
The capacity of recommender systems to make correct predictions is essentially determined by the quality and suitability of the collaborative filtering that implements them. The most common memory-based metrics are Pearson correlation and cosine; however, their use is not always the most appropriate or sufficiently justified. In this paper, we analyze these two metrics together with the less common mean squared difference (MSD) to discover their advantages and drawbacks in important aspects such as the impact of introducing different values of k-neighborhoods, minimization of the MAE error, capacity to carry out a sufficient number of predictions, percentage of correct and incorrect predictions, and behavior when attempting to recommend the n-best items. The paper lists the results and practical conclusions obtained from a comparative study of the metrics based on 135 experiments on the MovieLens database of 100,000 ratings.
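To make the three metrics named in this abstract concrete, here is a minimal Python sketch of Pearson correlation, cosine, and an MSD-based similarity between two users, plus the MAE error. It is an illustration under stated assumptions, not the paper's implementation: ratings are assumed to be stored as item-to-rating dictionaries on a 1 to 5 scale, similarities are computed over co-rated items only, and the function names and the "1 minus normalized MSD" form are choices made for this example.

```python
import math

def _co_rated(u, v):
    """Items rated by both users (assumption: dicts of item -> rating)."""
    return [item for item in u if item in v]

def pearson(u, v):
    """Pearson correlation over co-rated items; 0.0 when undefined."""
    items = _co_rated(u, v)
    if len(items) < 2:
        return 0.0
    mean_u = sum(u[i] for i in items) / len(items)
    mean_v = sum(v[i] for i in items) / len(items)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in items)
    den = math.sqrt(sum((u[i] - mean_u) ** 2 for i in items)
                    * sum((v[i] - mean_v) ** 2 for i in items))
    return num / den if den else 0.0

def cosine(u, v):
    """Cosine of the angle between the two co-rated rating vectors."""
    items = _co_rated(u, v)
    num = sum(u[i] * v[i] for i in items)
    den = math.sqrt(sum(u[i] ** 2 for i in items)
                    * sum(v[i] ** 2 for i in items))
    return num / den if den else 0.0

def msd_similarity(u, v, min_rating=1, max_rating=5):
    """1 minus the mean squared difference of range-normalized ratings.

    Assumption: the rating scale is [min_rating, max_rating].
    """
    items = _co_rated(u, v)
    if not items:
        return 0.0
    span = max_rating - min_rating
    msd = sum(((u[i] - v[i]) / span) ** 2 for i in items) / len(items)
    return 1.0 - msd

def mae(predicted, actual):
    """Mean absolute error between two equal-length rating lists."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical users for illustration only.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 3, "m3": 2}
print(pearson(alice, bob), cosine(alice, bob), msd_similarity(alice, bob))
```

In this toy example the two users rank the items identically, so Pearson is maximal, while cosine and the MSD-based similarity are slightly lower because they also respond to the size of the rating differences.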
Learning the Structure and Parameters of Large-Population Graphical Games from Behavioral Data
We consider learning, from strictly behavioral data, the structure and
parameters of linear influence games (LIGs), a class of parametric graphical
games introduced by Irfan and Ortiz (2014). LIGs facilitate causal strategic
inference (CSI): Making inferences from causal interventions on stable behavior
in strategic settings. Applications include the identification of the most
influential individuals in large (social) networks. Such tasks can also support
policy-making analysis. Motivated by the computational work on LIGs, we cast
the learning problem as maximum-likelihood estimation (MLE) of a generative
model defined by pure-strategy Nash equilibria (PSNE). Our simple formulation
uncovers the fundamental interplay between goodness-of-fit and model
complexity: good models capture equilibrium behavior within the data while
controlling the true number of equilibria, including those unobserved. We
provide a generalization bound establishing the sample complexity for MLE in
our framework. We propose several algorithms including convex loss minimization
(CLM) and sigmoidal approximations. We prove that the number of exact PSNE in
LIGs is small, with high probability; thus, CLM is sound. We illustrate our
approach on synthetic data and real-world U.S. congressional voting records. We
briefly discuss our learning framework's generality and potential applicability
to general graphical games.
Comment: Journal of Machine Learning Research. (accepted, pending
publication.) Last conference version: submitted March 30, 2012 to UAI 2012.
First conference version: entitled, Learning Influence Games, initially
submitted on June 1, 2010 to NIPS 201
Timescales of Massive Human Entrainment
The past two decades have seen an upsurge of interest in the collective
behaviors of complex systems composed of many agents entrained to each other
and to external events. In this paper, we extend concepts of entrainment to the
dynamics of human collective attention. We conducted a detailed investigation
of the unfolding of human entrainment - as expressed by the content and
patterns of hundreds of thousands of messages on Twitter - during the 2012 US
presidential debates. By time locking these data sources, we quantify the
impact of the unfolding debate on human attention. We show that collective
social behavior covaries second-by-second to the interactional dynamics of the
debates: A candidate speaking induces rapid increases in mentions of his name
on social media and decreases in mentions of the other candidate. Moreover,
interruptions by an interlocutor increase the attention received. We also
highlight a distinct time scale for the impact of salient moments in the
debate: Mentions in social media start within 5-10 seconds after the moment;
peak at approximately one minute; and slowly decay in a consistent fashion
across well-known events during the debates. Finally, we show that public
attention after an initial burst slowly decays through the course of the
debates. Thus we demonstrate that large-scale human entrainment may hold across
a number of distinct scales, in an exquisitely time-locked fashion. The methods
and results pave the way for careful study of the dynamics and mechanisms of
large-scale human entrainment.
Comment: 20 pages, 7 figures, 6 tables, 4 supplementary figures. 2nd version
revised according to peer reviewers' comments: more detailed explanation of
the methods, and grounding of the hypothese
A new paradigm of knowledge management: Crowdsourcing as emergent research and development
Drawing from knowledge management theory, this paper argues that the knowledge aggregation problem poses a fundamental constraint to knowledge creation and innovation, and offers a potential solution to this problem. Specific consequences of innovation failure include the failure of research and development to deliver new medicines to address threats such as widespread and increasing antibiotic resistance, the rise of airborne multidrug-resistant or totally drug-resistant tuberculosis, as well as a lack of new drugs to deal with emerging threats such as Ebola. Persistent constraints to knowledge creation exist in the form of market failure, or the failure of profit-seeking models of innovation to internalise the positive externalities associated with innovations, as well as academic failure, or the failure of academic research to provide much needed innovations to address societal problems. However, a lack of theory exists as to how to transcend these constraints to knowledge aggregation. This paper presents a probabilistic theoretical framework of innovation, suggesting that the "wisdom of the crowd", or emergent properties of problem-solving, may emerge as a function of scale when crowdsourcing principles are applied to research and development. It is argued in this paper that the consequences of a lack of knowledge of innovation failure are already upon us, and that a radical new approach to knowledge management and innovation is needed.
Keywords: probabilistic innovation, knowledge management, innovation, crowdsourcing, crowdsourced R&