Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis

Author: Mahoney Michael W.
Publication date: 01/01/2012

Database theory and database practice are typically the domain of computer scientists who adopt what may be termed an algorithmic perspective on their data. This perspective is very different from the more statistical perspective adopted by statisticians, scientific computers, machine learners, and others who work on what may be broadly termed statistical data analysis. In this article, I will address fundamental aspects of this algorithmic-statistical disconnect, with an eye to bridging the gap between these two very different approaches. A concept that lies at the heart of this disconnect is that of statistical regularization, a notion that has to do with how robust the output of an algorithm is to the noise properties of the input data. Although it is nearly completely absent from computer science, which historically has taken the input data as given and modeled algorithms discretely, regularization in one form or another is central to nearly every application domain that applies algorithms to noisy data. By using several case studies, I will illustrate, both theoretically and empirically, the nonobvious fact that approximate computation, in and of itself, can implicitly lead to statistical regularization. This and other recent work suggests that, by exploiting in a more principled way the statistical properties implicit in worst-case algorithms, one can in many cases satisfy the bicriteria of having algorithms that are scalable to very large-scale databases and that also have good inferential or predictive properties.

Comment: To appear in the Proceedings of the 2012 ACM Symposium on Principles of Database Systems (PODS 2012)
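The claim that approximate computation can act as a regularizer can be illustrated with a standard toy case, which is not taken from the article itself: gradient descent on a least-squares problem, stopped early. The data `X`, `y`, the step size, and the iteration counts below are all assumptions chosen for illustration. Stopping early leaves the poorly determined direction (the one with the small singular value, which is most sensitive to noise in `y`) heavily shrunk, much as an explicit ridge penalty would.

```python
def gradient_descent(X, y, step, iterations):
    """Run gradient descent on 0.5 * ||Xw - y||^2, starting from w = 0."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(iterations):
        # Residual X w - y, one entry per data row.
        residual = [sum(x * wi for x, wi in zip(row, w)) - yi
                    for row, yi in zip(X, y)]
        # Gradient X^T (X w - y), one entry per feature.
        grad = [sum(row[j] * r for row, r in zip(X, residual))
                for j in range(n_features)]
        w = [wi - step * gi for wi, gi in zip(w, grad)]
    return w


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return sum(x * x for x in v) ** 0.5


# Toy data (an assumption): the second direction has a tiny singular value,
# so the exact least-squares solution (1, 10) is large and noise-sensitive.
X = [[1.0, 0.0],
     [0.0, 0.1]]
y = [1.0, 1.0]

step = 1.0  # equals 1 / (largest eigenvalue of X^T X) for this X

w_early = gradient_descent(X, y, step, iterations=10)    # approximate answer
w_late = gradient_descent(X, y, step, iterations=5000)   # close to exact answer

print("early stop:", w_early, "norm:", norm(w_early))
print("converged: ", w_late, "norm:", norm(w_late))
```

The early-stopped iterate already fits the well-determined first coordinate but has a much smaller norm than the converged solution, which is the qualitative behavior one expects from a regularized estimate.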

arXiv.org e-Print Archive

CiteSeerX

Random walks on temporal networks

Authors: Baronchelli Andrea; Barrat Alain; Pastor-Satorras Romualdo; Starnini Michele
Publication date: 01/01/2012

Many natural and artificial networks evolve in time. Nodes and connections appear and disappear at various timescales, and their dynamics have profound consequences for any processes in which they are involved. The first empirical analyses of the temporal patterns characterizing dynamic networks are still recent, so many questions remain open. Here, we study how random walks, as a paradigm of dynamical processes, unfold on temporally evolving networks. To this end, we use empirical dynamical networks of contacts between individuals, and characterize the fundamental quantities that impact any general process taking place upon them. Furthermore, we introduce different randomizing strategies that allow us to single out the role of the different properties of the empirical networks. We show that the random walk exploration is slower on temporal networks than it is on the aggregate projected network, even when the time is properly rescaled. In particular, we point out that a fundamental role is played by the temporal correlations between consecutive contacts present in the data. Finally, we address the consequences of the intrinsically limited duration of many real-world dynamical networks. Considering the fundamental prototypical role of the random walk process, we believe that these results could help to shed light on the behavior of more complex dynamics on temporally evolving networks.

Comment: 14 pages, 13 figures
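To make the setting concrete, here is a minimal sketch of a random walk on a temporal network and on its aggregate projection. It is an illustration only, not the authors' code: the snapshot representation, the rule that the walker waits when its node has no active contact, and the tiny example data are all assumptions.

```python
import random


def temporal_random_walk(snapshots, start, rng):
    """Walk over a sequence of contact snapshots, one snapshot per time step.

    snapshots[t] is a list of undirected contacts (u, v) active at time t.
    At each step the walker moves to a uniformly chosen current neighbor,
    or stays put if its node has no active contact (an assumed rule).
    Returns the list of occupied nodes, beginning with `start`.
    """
    path = [start]
    current = start
    for edges in snapshots:
        neighbors = [v for u, v in edges if u == current]
        neighbors += [u for u, v in edges if v == current]
        if neighbors:
            current = rng.choice(neighbors)
        path.append(current)
    return path


def aggregate(snapshots):
    """Project all snapshots onto one static edge list, duplicates removed."""
    return sorted({tuple(sorted(edge)) for edges in snapshots for edge in edges})


def coverage(path):
    """Number of distinct nodes visited along a path."""
    return len(set(path))


# Hypothetical contact sequence on four nodes.
snapshots = [[(0, 1)], [(1, 2)], [(0, 1)], [(2, 3)]]
rng = random.Random(0)

# Walk on the time-ordered contacts.
temporal_path = temporal_random_walk(snapshots, start=0, rng=rng)

# Walk for the same number of steps on the aggregate (static) network.
static_edges = aggregate(snapshots)
static_path = temporal_random_walk([static_edges] * len(snapshots), start=0, rng=rng)

print("temporal path:", temporal_path, "coverage:", coverage(temporal_path))
print("static path:  ", static_path, "coverage:", coverage(static_path))
```

On the temporal sequence the walker is forced to wait whenever its node has no active contact, which is one mechanism by which exploration can differ from the static projection. Comparing coverage over many runs and on real contact data, as the abstract describes, would need proper time rescaling and is beyond this sketch.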

arXiv.org e-Print Archive

City Research Online

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

HAL AMU

Theories for influencer identification in complex networks

In social and biological systems, the structural heterogeneity of interaction networks gives rise to the emergence of a small set of influential nodes, or influencers, in a series of dynamical processes. Although much smaller than the entire network, these influencers have been observed to shape the collective dynamics of large populations in different contexts. As such, the successful identification of influencers should have profound implications for various real-world spreading dynamics such as viral marketing, epidemic outbreaks and cascading failures. In this chapter, we first summarize the centrality-based approach to finding single influencers in complex networks, and then discuss the more complicated problem of locating multiple influencers from a collective point of view. Progress rooted in collective influence theory, belief propagation and computer science will be presented. Finally, we present some applications of influencer identification in diverse real-world systems, including online social platforms, scientific publication, brain networks and socioeconomic systems.

Comment: 24 pages, 6 figures

arXiv.org e-Print Archive

Crossref

Interests Diffusion in Social Networks

Authors: D'Agostino Gregorio; D'Antonio Fulvio; De Nicola Antonio; Tucci Salvatore
Publication date: 08/01/2015

Understanding cultural phenomena on Social Networks (SNs) and exploiting the implicit knowledge about their members is attracting the interest of different research communities from both the academic and the business sides. The complexity science community is devoting significant effort to defining laws, models, and theories which, based on acquired knowledge, are able to predict future observations (e.g. the success of a product). In the meantime, the semantic web community aims at engineering a new generation of advanced services by defining constructs, models and methods that add a semantic layer to SNs. In this context, a leap forward is expected to come from a hybrid approach merging the disciplines above. Along this line, this work focuses on the propagation of individual interests in social networks. The proposed framework consists of the following main components: a method to gather information about the members of the social networks; methods to perform some semantic analysis of the Domain of Interest; a procedure to infer members' interests; and an interests evolution theory to predict how the interests propagate in the network. As a result, one obtains an analytic tool to measure individual features, such as members' susceptibilities and authorities. Although the approach applies to any type of social network, here it has been tested against the computer science research community. The DBLP (Digital Bibliography and Library Project) database has been selected as a test case since it provides the most comprehensive list of scientific production in this field.

Comment: 30 pages, 13 figs, 4 tables

arXiv.org e-Print Archive

ART

Multi-Level Modeling of Quotation Families Morphogenesis

Authors: Cointet Jean-Philippe; Omodei Elisa; Poibeau Thierry
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

This paper investigates cultural dynamics in social media by examining the proliferation and diversification of clear-cut pieces of content: quoted texts. In line with the pioneering work of Leskovec et al. and Simmons et al. on meme dynamics, we investigate in depth the transformations that quotations published online undergo during their diffusion. We deliberately put aside the structure of the social network as well as the dynamical patterns pertaining to the diffusion process to focus on the way quotations are changed, how often they are modified and how these changes shape more or less diverse families and sub-families of quotations. Following a biological metaphor, we try to understand in what ways mutations can transform quotations at different scales and how mutation rates depend on various properties of the quotations.

Comment: Published in the Proceedings of the ASE/IEEE 4th Intl. Conf. on Social Computing "SocialCom 2012", Sep. 3-5, 2012, Amsterdam, N

arXiv.org e-Print Archive

Crossref

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM