Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis
Database theory and database practice are typically the domain of computer
scientists who adopt what may be termed an algorithmic perspective on their
data. This perspective is very different from the more statistical perspective
adopted by statisticians, scientific computers, machine learners, and others who
work on what may be broadly termed statistical data analysis. In this article,
I will address fundamental aspects of this algorithmic-statistical disconnect,
with an eye to bridging the gap between these two very different approaches. A
concept that lies at the heart of this disconnect is that of statistical
regularization, a notion that has to do with how robust the output of an
algorithm is to the noise properties of the input data. Although it is nearly
completely absent from computer science, which historically has taken the input
data as given and modeled algorithms discretely, regularization in one form or
another is central to nearly every application domain that applies algorithms
to noisy data. By using several case studies, I will illustrate, both
theoretically and empirically, the nonobvious fact that approximate
computation, in and of itself, can implicitly lead to statistical
regularization. This and other recent work suggests that, by exploiting in a
more principled way the statistical properties implicit in worst-case
algorithms, one can in many cases satisfy the bicriteria of having algorithms
that are scalable to very large-scale databases and that also have good
inferential or predictive properties.
Comment: To appear in the Proceedings of the 2012 ACM Symposium on Principles
of Database Systems (PODS 2012)
Random walks on temporal networks
Many natural and artificial networks evolve in time. Nodes and connections
appear and disappear at various timescales, and their dynamics has profound
consequences for any processes in which they are involved. The first empirical
analyses of the temporal patterns characterizing dynamic networks are still
recent, so that many questions remain open. Here, we study how random walks, as
a paradigm of dynamical processes, unfold on temporally evolving networks. To
this aim, we use empirical dynamical networks of contacts between individuals,
and characterize the fundamental quantities that impact any general process
taking place upon them. Furthermore, we introduce different randomizing
strategies that allow us to single out the role of the different properties of
the empirical networks. We show that the random walk exploration is slower on
temporal networks than it is on the aggregate projected network, even when the
time is properly rescaled. In particular, we point out that a fundamental role
is played by the temporal correlations between consecutive contacts present in
the data. Finally, we address the consequences of the intrinsically limited
duration of many real world dynamical networks. Considering the fundamental
prototypical role of the random walk process, we believe that these results
could help to shed light on the behavior of more complex dynamics on temporally
evolving networks.
Comment: 14 pages, 13 figures
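The comparison described in this abstract, a walker on the time-ordered contact sequence versus a walker on the aggregated network, can be illustrated with a minimal sketch. It is not the authors' code: the snapshot representation (a list of undirected edge lists, one per time step), the function names, and the unweighted aggregate are all assumptions made for illustration.

```python
import random


def temporal_walk_coverage(snapshots, start, rng):
    """Walk over time-ordered snapshots; return the number of distinct nodes visited.

    Assumption: each snapshot is a list of undirected edges (u, v) active at that step.
    """
    node = start
    visited = {start}
    for edges in snapshots:
        # Only contacts active in the current snapshot can be used.
        neighbors = [v if u == node else u for u, v in edges if node in (u, v)]
        if neighbors:
            node = rng.choice(neighbors)
            visited.add(node)
        # With no active contact, the walker waits at its current node.
    return len(visited)


def aggregated_walk_coverage(snapshots, start, steps, rng):
    """Walk on the static union of all snapshots (unweighted, a simplification)."""
    adjacency = {}
    for edges in snapshots:
        for u, v in edges:
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)
    node = start
    visited = {start}
    for _ in range(steps):
        neighbors = sorted(adjacency.get(node, ()))
        if not neighbors:
            break
        node = rng.choice(neighbors)
        visited.add(node)
    return len(visited)


# Hypothetical toy data: three time steps over four nodes.
snapshots = [[(0, 1)], [(2, 3)], [(1, 2)]]
rng = random.Random(0)
temporal = temporal_walk_coverage(snapshots, start=0, rng=rng)
aggregated = aggregated_walk_coverage(snapshots, start=0, steps=len(snapshots), rng=rng)
```

In the toy data the temporal walker idles during the second step because node 1 has no active contact then. Waiting of this kind is one simple reason a temporal walk can explore more slowly than a walk given the same number of steps on the aggregate.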
Theories for influencer identification in complex networks
In social and biological systems, the structural heterogeneity of interaction
networks gives rise to the emergence of a small set of influential nodes, or
influencers, in a series of dynamical processes. Although much smaller than the
entire network, these influencers were observed to be able to shape the
collective dynamics of large populations in different contexts. As such, the
successful identification of influencers should have profound implications in
various real-world spreading dynamics such as viral marketing, epidemic
outbreaks and cascading failure. In this chapter, we first summarize the
centrality-based approach in finding single influencers in complex networks,
and then discuss the more complicated problem of locating multiple influencers
from a collective point of view. Progress rooted in collective influence
theory, belief-propagation and computer science will be presented. Finally, we
present some applications of influencer identification in diverse real-world
systems, including online social platforms, scientific publication, brain
networks and socioeconomic systems.
Comment: 24 pages, 6 figures
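As a rough illustration of the centrality-based approach to finding single influencers mentioned in this abstract, the sketch below ranks nodes by degree. Degree is only one possible centrality measure, chosen here as an assumption; the edge-list input and the function name `top_degree_nodes` are hypothetical and do not come from the chapter.

```python
from collections import Counter


def top_degree_nodes(edges, k):
    """Return the k nodes with the highest degree in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree; break ties by node label for determinism.
    return sorted(degree, key=lambda n: (-degree[n], n))[:k]


# Hypothetical toy network.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
candidates = top_degree_nodes(edges, k=2)
```

Ranking nodes one at a time like this ignores overlap between their neighborhoods, which is why the abstract treats locating multiple influencers as a separate, collective problem.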
Interests Diffusion in Social Networks
Understanding cultural phenomena on Social Networks (SNs) and exploiting the
implicit knowledge about their members is attracting the interest of different
research communities both from the academic and the business side. The
community of complexity science is devoting significant efforts to define laws,
models, and theories, which, based on acquired knowledge, are able to predict
future observations (e.g. success of a product). In the meantime, the semantic
web community aims at engineering a new generation of advanced services by
defining constructs, models and methods, adding a semantic layer to SNs. In
this context, a leapfrog is expected to come from a hybrid approach merging the
disciplines above. Along this line, this work focuses on the propagation of
individual interests in social networks. The proposed framework consists of the
following main components: a method to gather information about the members of
the social networks; methods to perform some semantic analysis of the Domain of
Interest; a procedure to infer members' interests; and an interests evolution
theory to predict how the interests propagate in the network. As a result, one
achieves an analytic tool to measure individual features, such as members'
susceptibilities and authorities. Although the approach applies to any type of
social network, here it has been tested against the computer science
research community.
The DBLP (Digital Bibliography and Library Project) database has been chosen
as the test case since it provides the most comprehensive list of scientific
production in this field.
Comment: 30 pages, 13 figs, 4 tables
Multi-Level Modeling of Quotation Families Morphogenesis
This paper investigates cultural dynamics in social media by examining the
proliferation and diversification of clear-cut pieces of content: quoted
texts. In line with the pioneering work of Leskovec et al. and Simmons et al.
on meme dynamics, we investigate in depth the transformations that quotations
published online undergo during their diffusion. We deliberately put aside the
structure of the social network as well as the dynamical patterns pertaining to
the diffusion process to focus on the way quotations are changed, how often
they are modified and how these changes shape more or less diverse families and
sub-families of quotations. Following a biological metaphor, we try to
understand in which way mutations can transform quotations at different scales
and how mutation rates depend on various properties of the quotations.
Comment: Published in the Proceedings of the ASE/IEEE 4th Intl. Conf. on
Social Computing "SocialCom 2012", Sep. 3-5, 2012, Amsterdam, N