Modeling the structure and evolution of discussion cascades
We analyze the structure and evolution of discussion cascades in four popular
websites: Slashdot, Barrapunto, Meneame and Wikipedia. Despite the big
heterogeneities between these sites, a preferential attachment (PA) model with
bias to the root can capture the temporal evolution of the observed trees and
many of their statistical properties, namely, probability distributions of the
branching factors (degrees), subtree sizes and certain correlations. The
parameters of the model are learned efficiently using a novel maximum
likelihood estimation scheme for PA and provide a figurative interpretation
about the communication habits and the resulting discussion cascades on the
four different websites. Comment: 10 pages, 11 figures
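As a rough illustration of the idea in this abstract, the sketch below grows a discussion tree by preferential attachment with extra weight on the root. The weighting rule (number of replies plus one, with the root's weight multiplied by a hypothetical `root_bias` factor) is an assumption chosen for simplicity and is not the parametrization used by the authors; the function name and defaults are likewise illustrative.

```python
import random

def grow_cascade(n_comments, root_bias=2.0, seed=0):
    """Grow a discussion tree; returns a parent list (node 0 is the root post).

    Assumed rule (not the paper's exact model): a new comment replies to an
    existing node with probability proportional to (replies + 1), and the
    root's weight is additionally multiplied by root_bias.
    """
    rng = random.Random(seed)
    parent = [None]   # parent[i] is the node that comment i replies to
    replies = [0]     # replies[i] counts the direct replies to node i
    for _ in range(n_comments):
        # Popular nodes attract more replies; +1 lets leaves be chosen too.
        weights = [r + 1.0 for r in replies]
        weights[0] *= root_bias  # bias toward the root post
        target = rng.choices(range(len(parent)), weights=weights, k=1)[0]
        parent.append(target)
        replies.append(0)
        replies[target] += 1
    return parent

parents = grow_cascade(50)
root_replies = sum(1 for p in parents[1:] if p == 0)
```

With `root_bias` greater than 1, a larger share of comments tends to attach directly to the root, giving shallow, wide trees; values near 1 reduce the model to plain preferential attachment under the assumed weights.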
Analysis of group evolution prediction in complex networks
In a world in which acceptance by and identification with social
communities are highly desired, the ability to predict the evolution of groups over
time appears to be a vital but very complex research problem. Therefore, we
propose a new, adaptable, generic and multi-stage method for Group Evolution
Prediction (GEP) in complex networks, that facilitates reasoning about the
future states of the recently discovered groups. The precise GEP modularity
enabled us to carry out extensive and versatile empirical studies on many
real-world complex / social networks to analyze the impact of numerous setups
and parameters like time window type and size, group detection method,
evolution chain length, prediction models, etc. Additionally, many new
predictive features reflecting the group state at a given time have been
identified and tested. Some other research problems like enriching learning
evolution chains with external data have been analyzed as well.
A hybrid algorithm for Bayesian network structure learning with application to multi-label learning
We present a novel hybrid algorithm for Bayesian network structure learning,
called H2PC. It first reconstructs the skeleton of a Bayesian network and then
performs a Bayesian-scoring greedy hill-climbing search to orient the edges.
The algorithm is based on divide-and-conquer constraint-based subroutines to
learn the local structure around a target variable. We conduct two series of
experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is
currently the most powerful state-of-the-art algorithm for Bayesian network
structure learning. First, we use eight well-known Bayesian network benchmarks
with various data sizes to assess the quality of the learned structure returned
by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in
terms of goodness of fit to new data and quality of the network structure with
respect to the true dependence structure of the data. Second, we investigate
H2PC's ability to solve the multi-label learning problem. We provide
theoretical results to characterize and identify graphically the so-called
minimal label powersets that appear as irreducible factors in the joint
distribution under the faithfulness condition. The multi-label learning problem
is then decomposed into a series of multi-class classification problems, where
each multi-class variable encodes a label powerset. H2PC is shown to compare
favorably to MMHC in terms of global classification accuracy over ten
multi-label data sets covering different application domains. Overall, our
experiments support the conclusions that local structural learning with H2PC in
the form of local neighborhood induction is a theoretically well-motivated and
empirically effective learning framework that is well suited to multi-label
learning. The source code (in R) of H2PC as well as all data sets used for the
empirical tests are publicly available. Comment: arXiv admin note: text overlap with arXiv:1101.5184 by other authors
Social Dynamics of Digg
Online social media provide multiple ways to find interesting content. One
important method is highlighting content recommended by a user's friends. We
examine this process on one such site, the news aggregator Digg. With a
stochastic model of user behavior, we distinguish the effects of the content
visibility and interestingness to users. We find a wide range of interest and
distinguish stories primarily of interest to a user's friends from those of
interest to the entire user community. We show how this model predicts a
story's eventual popularity from users' early reactions to it, and estimate the
prediction reliability. This modeling framework can help evaluate alternative
design choices for displaying content on the site. Comment: arXiv admin note: text overlap with arXiv:1010.023
A Survey on True-reputation Algorithm for Trustworthy Online Rating System
The average of customer ratings on a product, which we call its reputation, is one of the key factors in online shopping. Customers commonly express their satisfaction with a purchase through online ratings. Overall buyer satisfaction is quantified as the aggregated score of all ratings and is visible to all buyers. This average score, the product's reputation, acts as a guide for online buyers and strongly influences consumers' final purchase decisions. A reputation is trustworthy when a large number of buyers rate honestly. If some users deliberately give unfair ratings to an item, especially when few users have rated it, the reputation of the product can easily be distorted. To improve the trustworthiness of product reputations on e-commerce sites, a new model is proposed with a true-reputation algorithm that repeatedly adjusts the reputation based on the confidence of the user ratings.
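The abstract does not say how confidence is computed, so the following is only a minimal sketch of the general idea of repeatedly adjusting a reputation by rating confidence. It assumes, purely for illustration, that a rating's confidence is `1 / (1 + |rating - reputation|)`; the function name `adjusted_reputation` and its parameters are hypothetical and are not taken from the surveyed algorithm.

```python
def adjusted_reputation(ratings, max_iter=100, tol=1e-9):
    """Iteratively re-weight ratings by an assumed confidence measure.

    Assumption (not from the source): confidence falls off with distance
    from the current reputation estimate, as 1 / (1 + distance).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    reputation = sum(ratings) / len(ratings)  # start from the plain average
    for _ in range(max_iter):
        # Ratings far from the current estimate get lower confidence.
        weights = [1.0 / (1.0 + abs(r - reputation)) for r in ratings]
        updated = sum(w * r for w, r in zip(weights, ratings)) / sum(weights)
        if abs(updated - reputation) < tol:
            return updated
        reputation = updated
    return reputation

plain = sum([5, 5, 5, 5, 1]) / 5
robust = adjusted_reputation([5, 5, 5, 5, 1])
```

Under this assumed weighting, the single outlying rating of 1 pulls the adjusted score down less than it pulls the plain average, which is the qualitative behavior the abstract describes.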