Similarity Measure Development for Case-Based Reasoning- A Data-driven Approach
In this paper, we demonstrate a data-driven methodology for modelling the
local similarity measures of various attributes in a dataset. We analyse the
spread in the numerical attributes and estimate their distribution using
a polynomial function to showcase an approach for deriving strong initial value
ranges of numerical attributes and use a non-overlapping distribution for
categorical attributes such that the entire similarity range [0,1] is utilized.
We use an open source dataset for demonstrating modelling and development of
the similarity measures and will present a case-based reasoning (CBR) system
that can be used to search for the most relevant similar cases.
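As an illustration of how local similarity measures can feed a CBR retrieval step, the following is a minimal sketch and not the method of the paper. It assumes a polynomial fall-off over a known value range for numerical attributes, a hand-filled lookup table for categorical attributes, and a weighted mean as the global similarity. All function names, the exponent, and the weights are hypothetical.

```python
from typing import Dict, List, Tuple

def numeric_similarity(a: float, b: float, value_range: float,
                       exponent: float = 2.0) -> float:
    """Local similarity in [0, 1] with a polynomial fall-off (assumed form)."""
    if value_range <= 0:
        return 1.0 if a == b else 0.0
    # Normalise the distance by the attribute's range and cap it at 1.
    distance = min(abs(a - b) / value_range, 1.0)
    return 1.0 - distance ** exponent

def categorical_similarity(a: str, b: str,
                           table: Dict[Tuple[str, str], float]) -> float:
    """Local similarity from a symmetric lookup table; unknown pairs score 0."""
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))

def global_similarity(query: dict, case: dict,
                      numeric_ranges: Dict[str, float],
                      categorical_tables: Dict[str, Dict[Tuple[str, str], float]],
                      weights: Dict[str, float]) -> float:
    """Weighted mean of the local similarities (one assumed aggregation)."""
    total = 0.0
    weight_sum = 0.0
    for attr, weight in weights.items():
        if attr in numeric_ranges:
            sim = numeric_similarity(query[attr], case[attr], numeric_ranges[attr])
        else:
            sim = categorical_similarity(query[attr], case[attr],
                                         categorical_tables[attr])
        total += weight * sim
        weight_sum += weight
    return total / weight_sum

def retrieve(query: dict, case_base: List[dict],
             numeric_ranges, categorical_tables, weights,
             k: int = 3) -> List[Tuple[float, int]]:
    """Return the k best (similarity, case index) pairs, most similar first."""
    scored = [
        (global_similarity(query, case, numeric_ranges, categorical_tables, weights), i)
        for i, case in enumerate(case_base)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In this sketch the value range and the categorical table would be derived from the data; here they are simply passed in as parameters.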
Network Model Selection for Task-Focused Attributed Network Inference
Networks are models representing relationships between entities. Often these
relationships are explicitly given, or we must learn a representation which
generalizes and predicts observed behavior in underlying individual data (e.g.
attributes or labels). Whether given or inferred, choosing the best
representation affects subsequent tasks and questions on the network. This work
focuses on model selection to evaluate network representations from data,
focusing on fundamental predictive tasks on networks. We present a modular
methodology using general, interpretable network models, task neighborhood
functions found across domains, and several criteria for robust model
selection. We demonstrate our methodology on three online user activity
datasets and show that network model selection for the appropriate network task
vs. an alternate task increases performance by an order of magnitude in our
experiments.
Co-Following on Twitter
We present an in-depth study of co-following on Twitter based on the
observation that two Twitter users whose followers have similar friends are
also similar, even though they might not share any direct links or a single
mutual follower. We show how this observation contributes to (i) a better
understanding of language-agnostic user classification on Twitter, (ii)
eliciting opportunities for Computational Social Science, and (iii) improving
online marketing by identifying cross-selling opportunities.
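The observation above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: each account is described by how often its followers follow every other account, and two accounts are compared by the cosine of those count vectors. The function names and the toy follower data are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Iterable

def cofollow_profile(account: str,
                     followers_of: Dict[str, Iterable[str]],
                     friends_of: Dict[str, Iterable[str]]) -> Counter:
    """Count how often the account's followers follow each other account."""
    profile: Counter = Counter()
    for follower in followers_of.get(account, ()):
        for friend in friends_of.get(follower, ()):
            if friend != account:  # the account itself carries no signal
                profile[friend] += 1
    return profile

def cosine(p: Counter, q: Counter) -> float:
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(p[key] * q[key] for key in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

# Toy data: A and B share no follower and no direct link,
# yet their followers follow the same accounts X and Y.
followers_of = {"A": ["u1", "u2"], "B": ["u3", "u4"], "C": ["u5"]}
friends_of = {
    "u1": ["A", "X", "Y"], "u2": ["A", "X"],
    "u3": ["B", "X", "Y"], "u4": ["B", "X"],
    "u5": ["C", "Z"],
}

sim_ab = cosine(cofollow_profile("A", followers_of, friends_of),
                cofollow_profile("B", followers_of, friends_of))
sim_ac = cosine(cofollow_profile("A", followers_of, friends_of),
                cofollow_profile("C", followers_of, friends_of))
```

With this toy data, A and B come out as highly similar although they have no mutual follower, while A and C share nothing.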
We start with a machine learning problem of predicting a user's preference
among two alternative choices of Twitter friends. We show that co-following
information provides strong signals for diverse classification tasks and that
these signals persist even when (i) the most discriminative features are
removed and (ii) only relatively "sparse" users with fewer than 152 but more
than 43 Twitter friends are considered.
Going beyond mere classification performance optimization, we present
applications of our methodology to Computational Social Science. Here we
confirm stereotypes such as that the country singer Kenny Chesney
(@kennychesney) is more popular among @GOP followers, whereas Lady Gaga
(@ladygaga) enjoys more support from @TheDemocrats followers.
In the domain of marketing we give evidence that celebrity endorsement is
reflected in co-following and we demonstrate how our methodology can be used to
reveal the audience similarities between Apple and Puma and, less obviously,
between Nike and Coca-Cola. Concerning a user's popularity we find a
statistically significant connection between having a more "average"
followership and having more followers than direct rivals. Interestingly, a
larger audience also seems to be linked to a less diverse
audience in terms of their co-following.
Comment: full version of a short paper at Hypertext 201
Interactive product catalogue with user preference tracking
In the context of m-commerce, small screen size makes it seriously difficult for users to browse effectively through a product catalogue, given the limited number of products that can be presented on-screen. Despite the availability of search engines, filters and recommender systems to aid users, these techniques focus on a narrow segment of the product offering. Users are thus denied the opportunity to explore the available products more expansively. This paper describes a novel approach to overcoming the constraints of small screen size. By integrating a product catalogue with a recommender system, an adaptive system has been created that guides users through the process of product browsing. An original technique has been developed to cluster similar positive examples together in order to identify a user's areas of interest. The performance of this technique has been evaluated and the results proved to be promising.