Information Gathering in Networks via Active Exploration
How should we gather information in a network, where each node's visibility
is limited to its local neighborhood? This problem arises in numerous
real-world applications, such as surveying and task routing in social networks,
team formation in collaborative networks and experimental design with
dependency constraints. Often the informativeness of a set of nodes can be
quantified via a submodular utility function. Existing approaches for
submodular optimization, however, require that the set of all nodes that can be
selected is known ahead of time, which is often unrealistic. In contrast, we
propose a novel model where we start our exploration from an initial node, and
new nodes become visible and available for selection only once one of their
neighbors has been chosen. We then present a general algorithm NetExp for this
problem, and provide theoretical bounds on its performance dependent on
structural properties of the underlying network. We evaluate our methodology on
various simulated problem instances as well as on data collected from a social
question answering system deployed within a large enterprise. Comment: Longer version of IJCAI'15 paper
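The exploration model described in this abstract, where a node becomes selectable only after one of its neighbors has been chosen, can be illustrated with a small sketch. This is not the NetExp algorithm from the paper; it is a plain greedy baseline under stated assumptions: the graph is an adjacency dictionary, the utility is a simple coverage function (the number of nodes that are selected or adjacent to a selected node, which is submodular), and the names `greedy_explore`, `coverage` and `GRAPH` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Set

Graph = Dict[str, Set[str]]

# Hypothetical toy network: "b" is a hub that only becomes visible via "a".
GRAPH: Graph = {
    "a": {"b", "c"},
    "b": {"a", "d", "e", "f"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b"},
    "f": {"b"},
}


def coverage(graph: Graph) -> Callable[[FrozenSet[str]], float]:
    """Build an assumed utility: count nodes that are selected or adjacent to a selected node."""
    def utility(nodes: FrozenSet[str]) -> float:
        covered = set(nodes)
        for v in nodes:
            covered |= graph[v]
        return float(len(covered))
    return utility


def greedy_explore(graph: Graph, start: str, budget: int,
                   utility: Callable[[FrozenSet[str]], float]) -> List[str]:
    """Greedily pick up to `budget` nodes, choosing only among visible frontier nodes."""
    selected = [start]
    chosen = {start}
    frontier = set(graph[start]) - chosen  # nodes revealed by the start node
    while len(selected) < budget and frontier:
        base = utility(frozenset(chosen))
        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(frontier),
                   key=lambda v: utility(frozenset(chosen | {v})) - base)
        selected.append(best)
        chosen.add(best)
        frontier |= graph[best]  # choosing a node reveals its neighbors
        frontier -= chosen
    return selected


result = greedy_explore(GRAPH, "a", 3, coverage(GRAPH))
```

In this toy instance the hub "b" is picked second because it has the largest marginal coverage gain; after that every remaining frontier node adds nothing, so the tie is broken alphabetically.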
Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings
In this paper we present a novel interactive multimodal learning system,
which facilitates search and exploration in large networks of social multimedia
users. It allows the analyst to identify and select users of interest, and to
find similar users in an interactive learning setting. Our approach is based on
novel multimodal representations of users, words and concepts, which we
simultaneously learn by deploying a general-purpose neural embedding model. We
show these representations to be useful not only for categorizing users, but
also for automatically generating user and community profiles. Inspired by
traditional summarization approaches, we create the profiles by selecting
diverse and representative content from all available modalities, i.e. the
text, image and user modality. The usefulness of the approach is evaluated
using artificial actors, which simulate user behavior in a relevance feedback
scenario. Multiple experiments were conducted in order to evaluate the quality
of our multimodal representations, to compare different embedding strategies,
and to determine the importance of different modalities. We demonstrate the
capabilities of the proposed approach on two different multimedia collections
originating from the violent online extremism forum Stormfront and the
microblogging platform Twitter, which are particularly interesting due to the
high semantic level of the discussions they feature.
Exploration of User Groups in VEXUS
We introduce VEXUS, an interactive visualization framework for exploring user
data to fulfill tasks such as finding a set of experts, forming discussion
groups and analyzing collective behaviors. User data is characterized by a
combination of demographics like age and occupation, and actions such as rating
a movie, writing a paper, following a medical treatment or buying groceries.
The ubiquity of user data requires tools that help explorers, be they
specialists or novice users, acquire new insights. VEXUS lets explorers
interact with user data via visual primitives and builds an exploration profile
to recommend the next exploration steps. VEXUS combines state-of-the-art
visualization techniques with appropriate indexing of user data to provide fast
and relevant exploration.
Finding groups in data: Cluster analysis with ants
We present in this paper a modification of Lumer and Faieta’s algorithm for data clustering. This approach
mimics the clustering behavior observed in real ant colonies. The algorithm automatically discovers
clusters in numerical data without prior knowledge of the possible number of clusters. In this paper we focus
on ant-based clustering algorithms, a particular kind of swarm intelligence system, and on the effects on
the final clustering of using different dissimilarity metrics during classification: Euclidean, Cosine,
and Gower measures. Clustering with swarm-based algorithms is emerging as an alternative to more
conventional clustering methods, such as k-means. Among the many bio-inspired techniques, ant
clustering algorithms have received special attention, especially because they still require much
investigation to improve performance, stability and other key features that would make such algorithms
mature tools for data mining.
As a case study, this paper focuses on the behavior of clustering procedures in those new approaches.
The proposed algorithm and its modifications are evaluated on a number of well-known benchmark
datasets. Empirical results clearly show that ant-based clustering algorithms perform well when
compared to other techniques.
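The three dissimilarity measures named in this abstract can be written out for numerical vectors as follows. This is a minimal sketch using the standard textbook definitions, not code from the paper: the function names are hypothetical, the Gower measure is restricted to numeric features (mean of range-normalized absolute differences), and returning 1.0 for a zero vector in the cosine case is an assumed convention.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_dissimilarity(x: Sequence[float], y: Sequence[float]) -> float:
    """One minus the cosine of the angle between the vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 1.0  # assumed convention: a zero vector is maximally dissimilar
    return 1.0 - dot / (norm_x * norm_y)


def gower_numeric(x: Sequence[float], y: Sequence[float],
                  ranges: Sequence[float]) -> float:
    """Gower dissimilarity for numeric features only.

    `ranges` holds each feature's (max - min) over the dataset; a feature
    with zero range contributes nothing to the average.
    """
    terms = [abs(a - b) / r if r > 0 else 0.0
             for a, b, r in zip(x, y, ranges)]
    return sum(terms) / len(terms)
```

Euclidean distance is sensitive to feature scale, cosine dissimilarity depends only on direction, and the Gower measure normalizes each feature by its range, which is one reason the choice of metric can change the final clustering.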
Exploring Student Check-In Behavior for Improved Point-of-Interest Prediction
With the availability of vast amounts of user visitation history on
location-based social networks (LBSN), the problem of Point-of-Interest (POI)
prediction has been extensively studied. However, much of the research has been
conducted solely on voluntary check-in datasets collected from social apps such
as Foursquare or Yelp. While these data contain rich information about
recreational activities (e.g., restaurants, nightlife, and entertainment),
information about more prosaic aspects of people's lives is sparse. This not
only limits our understanding of users' daily routines but, more importantly,
means that the modeling assumptions developed based on characteristics of
recreation-based data may not be suitable for richer check-in data. In this work, we present an
analysis of education "check-in" data using WiFi access logs collected at
Purdue University. We propose a heterogeneous graph-based method to encode the
correlations between users, POIs, and activities, and then jointly learn
embeddings for the vertices. We evaluate our method compared to previous
state-of-the-art POI prediction methods, and show that the assumptions made by
previous methods significantly degrade performance on our data with dense(r)
activity signals. We also show how our learned embeddings could be used to
identify similar students (e.g., for friend suggestions). Comment: published in KDD'1