1,155 research outputs found
Fusing Data with Correlations
Many applications rely on Web data and extraction systems to accomplish
knowledge-driven tasks. Web information is not curated, so many sources provide
inaccurate or conflicting information. Moreover, extraction systems introduce
additional noise to the data. We wish to automatically distinguish correct data
from erroneous data to create a cleaner set of integrated data. Previous work
has shown that a naive voting strategy that trusts data provided by the
majority or at least a certain number of sources may not work well in the
presence of copying between the sources. However, correlation between sources
can be much broader than copying: sources may provide data from complementary
domains (negative correlation), extractors may focus on different types
of information (negative correlation), and extractors may apply common
rules in extraction (positive correlation, without copying). In this
paper we present novel techniques for modeling correlations between sources and
applying them in truth finding.
Comment: Sigmod'201
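As an illustration of the naive voting baseline that the abstract argues against, here is a minimal Python sketch. It is not the paper's technique. The function name `majority_vote`, the `(source, item, value)` claim format, and the toy data are all assumptions made for this example. It shows how copied sources can outvote independent ones.

```python
from collections import Counter, defaultdict


def majority_vote(claims, min_sources=1):
    """Naive truth finding: for each item, trust the value asserted by most sources.

    `claims` is an iterable of (source, item, value) triples (a hypothetical
    format chosen for this sketch). A value is accepted only if at least
    `min_sources` distinct sources provide it.
    """
    votes = defaultdict(Counter)
    seen = set()
    for source, item, value in claims:
        # Count each (source, item, value) triple once, even if repeated.
        if (source, item, value) in seen:
            continue
        seen.add((source, item, value))
        votes[item][value] += 1

    result = {}
    for item, counter in votes.items():
        value, count = counter.most_common(1)[0]
        if count >= min_sources:
            result[item] = value
    return result


# Toy data: S2 and S3 copy S1's value, while S4 and S5 report independently.
claims = [
    ("S1", "capital", "X"),
    ("S2", "capital", "X"),  # copied from S1
    ("S3", "capital", "X"),  # copied from S1
    ("S4", "capital", "Y"),
    ("S5", "capital", "Y"),
]
winner = majority_vote(claims)
```

In this toy data the copied value "X" wins three votes to two, even though only one independent source supports it. This is the failure mode that motivates modeling correlations between sources.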
Experiments on Dynamic Parallel Magnetism in Superfluid 3He
Observations are reported of the ringing of parallel magnetization in superfluid 3He when an incremental magnetic field parallel to a steady field is suddenly turned off.
Information Horizons in Networks
We investigate and quantify the interplay between topology and ability to
send specific signals in complex networks. We find that in a majority of
investigated real-world networks the ability to communicate is favored by the
network topology on small distances, but disfavored at larger distances. We
further discuss how the ability to locate specific nodes can be improved if
information associated with the overall traffic in the network is available.
Comment: Submitted top PR
Activity ageing in growing networks
We present a model for growing information networks where the ageing of a
node depends on the time at which it entered the network and on the last time
it was cited. The model is shown to undergo a transition from a small-world to
large-world network. The degree distribution may exhibit very different shapes
depending on the model parameters, e.g. delta-peaked, exponential or power-law
tailed distributions.
Comment: 9 pages, 2 figure
Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search
We present a framework for quantifying and mitigating algorithmic bias in
mechanisms designed for ranking individuals, typically used as part of
web-scale search and recommendation systems. We first propose complementary
measures to quantify bias with respect to protected attributes such as gender
and age. We then present algorithms for computing fairness-aware re-ranking of
results. For a given search or recommendation task, our algorithms seek to
achieve a desired distribution of top ranked results with respect to one or
more protected attributes. We show that such a framework can be tailored to
achieve fairness criteria such as equality of opportunity and demographic
parity depending on the choice of the desired distribution. We evaluate the
proposed algorithms via extensive simulations over different parameter choices,
and study the effect of fairness-aware ranking on both bias and utility
measures. We finally present the online A/B testing results from applying our
framework towards representative ranking in LinkedIn Talent Search, and discuss
the lessons learned in practice. Our approach resulted in tremendous
improvement in the fairness metrics (nearly three fold increase in the number
of search queries with representative results) without affecting the business
metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users
worldwide. Ours is the first large-scale deployed framework for ensuring
fairness in the hiring domain, with the potential positive impact for more than
630M LinkedIn members.
Comment: This paper has been accepted for publication at ACM KDD 201
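The idea of re-ranking toward a desired distribution over a protected attribute can be sketched as a simple greedy procedure. This is a hypothetical illustration, not the algorithm deployed in the paper. The function name `rerank`, the `(id, group, score)` candidate format, and the floor-based minimum-count rule are assumptions made for this example.

```python
import math
from collections import defaultdict, deque


def rerank(candidates, desired, k):
    """Greedy fairness-aware re-ranking (illustrative sketch).

    candidates: list of (id, group, score) tuples.
    desired:    dict mapping group -> desired proportion in the top results.
    k:          number of results to return.

    At each position `pos`, a group that has fewer than
    floor(desired[group] * pos) members so far takes priority. Otherwise the
    highest-scoring remaining candidate is chosen.
    """
    # One score-sorted queue per group.
    queues = defaultdict(deque)
    for cand in sorted(candidates, key=lambda c: c[2], reverse=True):
        queues[cand[1]].append(cand)

    counts = defaultdict(int)
    ranked = []
    for pos in range(1, k + 1):
        nonempty = [g for g in queues if queues[g]]
        if not nonempty:
            break
        below = [g for g in nonempty
                 if counts[g] < math.floor(desired.get(g, 0.0) * pos)]
        pool = below or nonempty
        # Among eligible groups, take the one whose best candidate scores highest.
        best = max(pool, key=lambda g: queues[g][0][2])
        ranked.append(queues[best].popleft())
        counts[best] += 1
    return ranked


candidates = [
    ("a1", "A", 0.9), ("a2", "A", 0.8), ("a3", "A", 0.7), ("a4", "A", 0.6),
    ("b1", "B", 0.5), ("b2", "B", 0.4),
]
top4 = [c[0] for c in rerank(candidates, {"A": 0.5, "B": 0.5}, k=4)]
```

Ranking purely by score would put four members of group A on top. With a desired 50/50 distribution, the sketch interleaves the two groups while keeping each group's internal score order.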
Identifying communities by influence dynamics in social networks
Communities are not static; they evolve, split and merge, appear and
disappear, i.e. they are the product of dynamical processes that govern the
evolution of the network. A good algorithm for community detection should not
only quantify the topology of the network, but incorporate the dynamical
processes that take place on the network. We present a novel algorithm for
community detection that combines network structure with processes that support
creation and/or evolution of communities. The algorithm does not embrace the
universal approach but instead tries to focus on social networks and model
dynamic social interactions that occur on those networks. It identifies
leaders, and communities that form around those leaders. It naturally supports
overlapping communities by associating each node with a membership vector that
describes the node's involvement in each community. This way, in addition to
overlapping communities, we can identify nodes that are good followers to their
leader, and also nodes with no clear community involvement that serve as a
proxy between several communities and are equally important. We run the
algorithm on several real social networks which we believe represent a good
fraction of the wide body of social networks and discuss the results including
other possible applications.
Comment: 10 pages, 6 figure
Log-Networks
We introduce a growing network model in which a new node attaches to a
randomly-selected node, as well as to all ancestors of the target node. This
mechanism produces a sparse, ultra-small network where the average node degree
grows logarithmically with network size while the network diameter equals 2. We
determine basic geometrical network properties, such as the size dependence of
the number of links and the in- and out-degree distributions. We also compare
our predictions with real networks where the node degree also grows slowly with
time -- the Internet and the citation network of all Physical Review papers.
Comment: 7 pages, 6 figures, 2-column revtex4 format. Version 2: minor changes
in response to referee comments and to another proofreading; final version
for PR
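The growth rule described in this abstract can be simulated in a few lines. The sketch below is a minimal Python illustration under one assumption: the "ancestors" of a node are all nodes it links to, directly or transitively. The function name `grow_log_network` and the dictionary representation are choices made for this example, not taken from the paper.

```python
import random


def grow_log_network(n, seed=None):
    """Grow a network of n nodes, numbered 0..n-1 in order of arrival.

    Each new node picks a uniformly random existing target and links to the
    target and to all of the target's ancestors. Returns a dict mapping each
    node to the set of nodes it links to (its out-links).
    """
    rng = random.Random(seed)
    out_links = {0: set()}  # node 0 is the root and has no ancestors
    for new in range(1, n):
        target = rng.randrange(new)
        # The target's out-links already contain all of its ancestors, so the
        # union below is the full ancestor set of the new node.
        out_links[new] = {target} | out_links[target]
    return out_links


net = grow_log_network(200, seed=1)
mean_out_degree = sum(len(links) for links in net.values()) / len(net)
```

Every node other than the root ends up linked to node 0, so any two nodes are at most two undirected hops apart. This matches the diameter of 2 stated in the abstract. `mean_out_degree` can be tracked for increasing `n` to see the slow growth of the average degree.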
Asymptotic behavior of the Kleinberg model
We study Kleinberg navigation (the search of a target in a d-dimensional
lattice, where each site is connected to one other random site at distance r,
with probability proportional to r^{-a}) by means of an exact master equation
for the process. We show that the delivery time T to a target at distance L
scales asymptotically as (ln L)^2 when a=d, and otherwise as L^x, with
x=(d-a)/(d+1-a) for a<d, x=a-d for d<a<d+1, and x=1 for a>d+1. These
values of x exceed the rigorous lower-bounds established by Kleinberg. We also
address the situation where there is a finite probability for the message to
get lost along its way and find short delivery times (conditioned upon arrival)
for a wide range of a's.
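The scaling exponent can be written as a small helper function. The sketch below assumes the piecewise form x=(d-a)/(d+1-a) for a<d, x=a-d for d<a<d+1, and x=1 for a>d+1, with logarithmic scaling at a=d. The function name `delivery_time_exponent` is hypothetical, and taking x=1 at exactly a=d+1 is an assumption based on continuity of the two neighboring branches.

```python
def delivery_time_exponent(a: float, d: int) -> float:
    """Exponent x in T ~ L^x for Kleinberg navigation on a d-dimensional lattice.

    `a` is the exponent of the long-range link probability, r^{-a}.
    Returns 0.0 at a == d, where T grows only as (ln L)^2 rather than as a
    power of L.
    """
    if a < d:
        return (d - a) / (d + 1 - a)
    if a == d:
        return 0.0  # logarithmic regime: T ~ (ln L)^2
    if a < d + 1:
        return a - d
    return 1.0  # a >= d + 1: long-range links no longer help


# Exponents for a two-dimensional lattice at a few values of a.
exponents_2d = {a: delivery_time_exponent(a, 2) for a in (0.0, 1.0, 2.0, 2.5, 3.0, 5.0)}
```

The exponent is smallest at a=d and approaches 1 on the far side, so the delivery time is shortest when the link-length exponent matches the lattice dimension.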
Small world yields the most effective information spreading
Spreading dynamics of information and diseases are usually analyzed by using
a unified framework and analogous models. In this paper, we propose a model to
emphasize the essential difference between information spreading and epidemic
spreading, where the memory effects, the social reinforcement and the
non-redundancy of contacts are taken into account. Under certain conditions,
the information spreads faster and broader in regular networks than in random
networks, which to some extent supports the recent experimental observation of
spreading in online society [D. Centola, Science 329, 1194 (2010)]. At
the same time, simulation results indicate that random networks tend to be
favorable for effective spreading when the network size increases. This
challenges the validity of the above-mentioned experiment for large-scale
systems. More significantly, we show that the spreading effectiveness can be
sharply enhanced by introducing a little randomness into the regular structure,
namely the small-world networks yield the most effective information spreading.
Our work provides insights into the role of local clustering
in information spreading.
Comment: 6 pages, 7 figures, accepted by New J. Phy
The Influence of Early Respondents: Information Cascade Effects in Online Event Scheduling
Sequential group decision-making processes, such as online event scheduling, can be subject to social influence if the decisions involve individuals’ subjective preferences and values. Indeed, prior work has shown that scheduling polls that allow respondents to see others’ answers are more likely to succeed than polls that hide other responses, suggesting the impact of social influence and coordination. In this paper, we investigate whether this difference is due to information cascade effects in which later respondents adopt the decisions of earlier respondents. Analyzing more than 1.3 million Doodle polls, we found evidence that cascading effects take place during event scheduling, and in particular, that early respondents have a larger influence on the outcome of a poll than people who come late. Drawing on simulations of an event scheduling model, we compare possible interventions to mitigate this bias and show that we can optimize the success of polls by hiding the responses of a small percentage of low availability respondents.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/134703/1/Romero et al 2017 (WSDM).pd