1,155 research outputs found

    Fusing Data with Correlations

    Many applications rely on Web data and extraction systems to accomplish knowledge-driven tasks. Web information is not curated, so many sources provide inaccurate or conflicting information. Moreover, extraction systems introduce additional noise to the data. We wish to automatically distinguish correct data from erroneous data to create a cleaner set of integrated data. Previous work has shown that a naïve voting strategy that trusts data provided by the majority, or by at least a certain number of sources, may not work well in the presence of copying between the sources. However, correlation between sources can be much broader than copying: sources may provide data from complementary domains (negative correlation), extractors may focus on different types of information (negative correlation), and extractors may apply common rules in extraction (positive correlation, without copying). In this paper we present novel techniques for modeling correlations between sources and applying them in truth finding. Comment: Sigmod'201
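    The weakness of naïve voting under copying can be illustrated with a minimal sketch. The source names, the claimed values, and the `majority_vote` helper are hypothetical and are not taken from the paper; the sketch only shows the baseline the abstract criticizes, not the authors' correlation model.

```python
from collections import Counter

def majority_vote(claims):
    """Return the value claimed by the largest number of sources.

    `claims` maps a (hypothetical) source name to the value it provides.
    """
    counts = Counter(claims.values())
    return counts.most_common(1)[0][0]

# Three independent sources: two are right, one is wrong.
independent = {"s1": "Paris", "s2": "Paris", "s3": "Lyon"}

# Two extra sources copy the wrong value from s3, so the copies
# are counted as if they were independent evidence.
with_copiers = dict(independent, s4="Lyon", s5="Lyon")

vote_without_copying = majority_vote(independent)
vote_with_copying = majority_vote(with_copiers)
```

    With independent sources the vote follows the two correct sources; once the copies are counted as separate votes, the wrong value wins.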

    Experiments on Dynamic Parallel Magnetism in Superfluid 3He

    Observations are reported of the ringing of parallel magnetization in superfluid 3He when an incremental magnetic field parallel to a steady field is suddenly turned off.

    Information Horizons in Networks

    We investigate and quantify the interplay between topology and the ability to send specific signals in complex networks. We find that in a majority of investigated real-world networks the ability to communicate is favored by the network topology at small distances, but disfavored at larger distances. We further discuss how the ability to locate specific nodes can be improved if information associated with the overall traffic in the network is available. Comment: Submitted top PR

    Activity ageing in growing networks

    We present a model for growing information networks where the ageing of a node depends on the time at which it entered the network and on the last time it was cited. The model is shown to undergo a transition from a small-world to large-world network. The degree distribution may exhibit very different shapes depending on the model parameters, e.g. delta-peaked, exponential or power-law tailed distributions. Comment: 9 pages, 2 figure

    Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search

    We present a framework for quantifying and mitigating algorithmic bias in mechanisms designed for ranking individuals, typically used as part of web-scale search and recommendation systems. We first propose complementary measures to quantify bias with respect to protected attributes such as gender and age. We then present algorithms for computing fairness-aware re-ranking of results. For a given search or recommendation task, our algorithms seek to achieve a desired distribution of top-ranked results with respect to one or more protected attributes. We show that such a framework can be tailored to achieve fairness criteria such as equality of opportunity and demographic parity, depending on the choice of the desired distribution. We evaluate the proposed algorithms via extensive simulations over different parameter choices, and study the effect of fairness-aware ranking on both bias and utility measures. We finally present the online A/B testing results from applying our framework towards representative ranking in LinkedIn Talent Search, and discuss the lessons learned in practice. Our approach resulted in tremendous improvement in the fairness metrics (a nearly threefold increase in the number of search queries with representative results) without affecting the business metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users worldwide. Ours is the first large-scale deployed framework for ensuring fairness in the hiring domain, with a potential positive impact for more than 630M LinkedIn members. Comment: This paper has been accepted for publication at ACM KDD 201
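    The idea of re-ranking towards a desired distribution over a protected attribute can be sketched as a simple greedy procedure. This is an illustration only, not the algorithm from the paper: the `rerank` function, the candidate tuples, the group labels, and the floor-based minimum-count rule are all assumptions made for the example.

```python
import math

def rerank(candidates, target, k):
    """Greedy re-ranking towards a target group distribution (illustrative).

    candidates: list of (candidate_id, group, score) tuples.
    target: dict mapping each group to its desired share of the top results.
    At each position, a group that is below floor(share * position) is
    served first; otherwise the best remaining score wins.
    """
    queues = {
        g: sorted((c for c in candidates if c[1] == g), key=lambda c: -c[2])
        for g in target
    }
    counts = {g: 0 for g in target}
    ranked = []
    for position in range(1, k + 1):
        available = [g for g in target if queues[g]]
        if not available:
            break
        below_minimum = [
            g for g in available
            if counts[g] < math.floor(target[g] * position)
        ]
        pool = below_minimum or available
        group = max(pool, key=lambda g: queues[g][0][2])
        ranked.append(queues[group].pop(0))
        counts[group] += 1
    return ranked

# Hypothetical candidates: group A dominates the score ordering.
candidates = [
    ("a1", "A", 0.9), ("a2", "A", 0.8), ("a3", "A", 0.7), ("a4", "A", 0.6),
    ("b1", "B", 0.5), ("b2", "B", 0.4),
]
top4 = rerank(candidates, target={"A": 0.5, "B": 0.5}, k=4)
top4_groups = [group for _, group, _ in top4]
```

    A purely score-based ranking would put four group-A candidates on top; with an equal target share the sketch alternates between the two groups while keeping the best-scoring candidate of each group first.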

    Identifying communities by influence dynamics in social networks

    Communities are not static; they evolve, split and merge, appear and disappear, i.e. they are the product of dynamical processes that govern the evolution of the network. A good algorithm for community detection should not only quantify the topology of the network, but also incorporate the dynamical processes that take place on the network. We present a novel algorithm for community detection that combines network structure with processes that support creation and/or evolution of communities. The algorithm does not take a universal approach but instead focuses on social networks and models the dynamic social interactions that occur on those networks. It identifies leaders, and communities that form around those leaders. It naturally supports overlapping communities by associating each node with a membership vector that describes the node's involvement in each community. This way, in addition to overlapping communities, we can identify nodes that are good followers of their leader, and also nodes with no clear community involvement that serve as a proxy between several communities and are equally important. We run the algorithm on several real social networks which we believe represent a good fraction of the wide body of social networks, and discuss the results, including other possible applications. Comment: 10 pages, 6 figure

    Log-Networks

    We introduce a growing network model in which a new node attaches to a randomly-selected node, as well as to all ancestors of the target node. This mechanism produces a sparse, ultra-small network where the average node degree grows logarithmically with network size while the network diameter equals 2. We determine basic geometrical network properties, such as the size dependence of the number of links and the in- and out-degree distributions. We also compare our predictions with real networks where the node degree also grows slowly with time -- the Internet and the citation network of all Physical Review papers. Comment: 7 pages, 6 figures, 2-column revtex4 format. Version 2: minor changes in response to referee comments and to another proofreading; final version for PR
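    The growth rule described above can be sketched in a few lines. The function name `grow_log_network`, the single-root seed (node 0 with no links), and the choice of 1000 nodes are assumptions for illustration; the abstract does not specify the initial condition.

```python
import random

def grow_log_network(n_nodes, seed=0):
    """Grow a network where each new node links to a random target
    and to all ancestors of that target.

    Assumption: the network starts from a single root node 0.
    Because every node already links to all of its own ancestors,
    the ancestors of a target are exactly its current out-links.
    """
    rng = random.Random(seed)
    out_links = {0: set()}
    for new in range(1, n_nodes):
        target = rng.randrange(new)  # uniformly random existing node
        out_links[new] = {target} | out_links[target]
    return out_links

net = grow_log_network(1000)
n_links = sum(len(links) for links in net.values())
mean_out_degree = n_links / len(net)
```

    In this sketch every non-root node ends up linked to the root, so any two nodes are at most two hops apart when links are treated as undirected, which is consistent with the diameter of 2 stated in the abstract.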

    Asymptotic behavior of the Kleinberg model

    We study Kleinberg navigation (the search for a target in a d-dimensional lattice, where each site is connected to one other random site at distance r, with probability proportional to r^{-a}) by means of an exact master equation for the process. We show that the delivery time T to a target at distance L scales asymptotically as (ln L)^2 when a=d, and otherwise as L^x, with x=(d-a)/(d+1-a) for a<d, x=a-d for d<a<d+1, and x=1 for a>d+1. These values of x exceed the rigorous lower bounds established by Kleinberg. We also address the situation where there is a finite probability for the message to get lost along its way and find short delivery times (conditioned upon arrival) for a wide range of a's.
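    The navigation setup can be sketched for the one-dimensional case (d = 1) on a ring. This is a direct simulation, not the master-equation approach of the paper, and several choices are assumptions: greedy forwarding to the neighbor closest to the target, periodic boundaries, a maximum contact distance of half the ring, and the names `ring_distance` and `delivery_time`. Each site's long-range contact is drawn when the site is first visited, which is harmless here because greedy forwarding never revisits a site.

```python
import random

def ring_distance(i, j, size):
    """Shortest distance between sites i and j on a ring of `size` sites."""
    d = abs(i - j) % size
    return min(d, size - d)

def delivery_time(size, a, start, target, rng):
    """Count greedy hops from `start` to `target` on a 1D ring."""
    distances = range(1, size // 2 + 1)
    weights = [r ** (-a) for r in distances]  # P(r) proportional to r^(-a)
    pos, hops = start, 0
    while pos != target:
        # Draw this site's single long-range contact.
        r = rng.choices(distances, weights=weights, k=1)[0]
        contact = (pos + rng.choice((-r, r))) % size
        # Forward to whichever neighbor is closest to the target.
        candidates = [(pos - 1) % size, (pos + 1) % size, contact]
        pos = min(candidates, key=lambda c: ring_distance(c, target, size))
        hops += 1
    return hops

rng = random.Random(1)
hops = delivery_time(size=1000, a=1.0, start=0, target=400, rng=rng)
```

    Since one of the two local neighbors always reduces the distance by one, the hop count can never exceed the initial distance; averaging `hops` over many runs and system sizes would be the way to probe the scaling with L numerically.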

    Small world yields the most effective information spreading

    Spreading dynamics of information and diseases are usually analyzed using a unified framework and analogous models. In this paper, we propose a model that emphasizes the essential difference between information spreading and epidemic spreading, where memory effects, social reinforcement and the non-redundancy of contacts are taken into account. Under certain conditions, information spreads faster and more broadly in regular networks than in random networks, which to some extent supports the recent experimental observation of spreading in online society [D. Centola, Science 329, 1194 (2010)]. At the same time, simulation results indicate that random networks tend to be favorable for effective spreading when the network size increases. This challenges the validity of the above-mentioned experiment for large-scale systems. More significantly, we show that the spreading effectiveness can be sharply enhanced by introducing a little randomness into the regular structure, namely that small-world networks yield the most effective information spreading. Our work provides insights into the role of local clustering in information spreading. Comment: 6 pages, 7 figures, accepted by New J. Phy

    The Influence of Early Respondents: Information Cascade Effects in Online Event Scheduling

    Sequential group decision-making processes, such as online event scheduling, can be subject to social influence if the decisions involve individuals’ subjective preferences and values. Indeed, prior work has shown that scheduling polls that allow respondents to see others’ answers are more likely to succeed than polls that hide other responses, suggesting the impact of social influence and coordination. In this paper, we investigate whether this difference is due to information cascade effects in which later respondents adopt the decisions of earlier respondents. Analyzing more than 1.3 million Doodle polls, we found evidence that cascading effects take place during event scheduling, and in particular, that early respondents have a larger influence on the outcome of a poll than people who come late. Drawing on simulations of an event scheduling model, we compare possible interventions to mitigate this bias and show that we can optimize the success of polls by hiding the responses of a small percentage of low-availability respondents. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/134703/1/Romero et al 2017 (WSDM).pd