655 research outputs found

    Discriminative Distance-Based Network Indices with Application to Link Prediction

    In large networks, using the length of shortest paths as the distance measure has shortcomings. A well-studied shortcoming is that extending it to disconnected graphs and to directed graphs is controversial. The second shortcoming is that a huge number of vertices may have exactly the same score. The third shortcoming is that in many applications, the distance between two vertices depends not only on the length of shortest paths but also on the number of shortest paths. In this paper, first we develop a new distance measure between vertices of a graph that yields discriminative distance-based centrality indices. This measure is proportional to the length of shortest paths and inversely proportional to the number of shortest paths. We present algorithms for exact computation of the proposed discriminative indices. Second, we develop randomized algorithms that precisely estimate average discriminative path length and average discriminative eccentricity, and show that they give $(\epsilon,\delta)$-approximations of these indices. Third, we perform extensive experiments over several real-world networks from different domains. In our experiments, we first show that, compared to the traditional indices, discriminative indices usually have much greater discriminability. Then, we show that our randomized algorithms can very precisely estimate average discriminative path length and average discriminative eccentricity, using only a few samples. Then, we show that real-world networks usually have a tiny average discriminative path length, bounded by a constant (e.g., 2). Fourth, in order to better motivate the usefulness of our proposed distance measure, we present a novel link prediction method that uses discriminative distance to decide which vertices are more likely to form a link in the future, and show its superior performance compared to the well-known existing measures.
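    The measure is only described qualitatively in the abstract (proportional to shortest-path length, inversely proportional to the number of shortest paths), so the snippet below is just one plausible reading of it, not the paper's exact definition; the function name and the simple ratio are assumptions for illustration.

    import networkx as nx

    def discriminative_distance(G, u, v):
        # Shortest-path length between u and v, discounted by the number of
        # distinct shortest paths: many shortest paths -> smaller distance.
        # (Assumed form; the paper's definition may normalize differently.)
        length = nx.shortest_path_length(G, u, v)
        num_paths = sum(1 for _ in nx.all_shortest_paths(G, u, v))
        return length / num_paths

    G = nx.karate_club_graph()
    print(discriminative_distance(G, 0, 33))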

    A Simple Proof of a New Set Disjointness with Applications to Data Streams


    Budget-Feasible Mechanism Design for Non-Monotone Submodular Objectives: Offline and Online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible and $O(1)$-approximate mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Prior to our work, the only $O(1)$-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). Ours is the first mechanism for the problem in which, crucially, the agents are not ordered with respect to their marginal value per cost. This allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present. We obtain $O(p)$-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a $p$-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about non-trivial approximation guarantees in polynomial time, our results are asymptotically best possible.
    Comment: Accepted to EC 201
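    The mechanism itself is not spelled out in the abstract, so the sketch below only illustrates the generic "two candidate solutions" trick for submodular maximization under a knapsack (budget) constraint: a greedy set compared against the best affordable single element. It is not the authors' truthful mechanism, and the toy coverage objective and all names are assumptions for illustration.

    def value(S, sets):
        # Toy monotone submodular objective: size of the union of chosen sets.
        covered = set()
        for i in S:
            covered |= sets[i]
        return len(covered)

    def two_candidate_greedy(sets, costs, budget):
        # Candidate 1: greedily add the affordable element with the best
        # marginal gain until nothing improves or the budget runs out.
        chosen, spent = [], 0
        remaining = set(range(len(sets)))
        while True:
            best, best_gain = None, 0
            for i in remaining:
                if spent + costs[i] <= budget:
                    gain = value(chosen + [i], sets) - value(chosen, sets)
                    if gain > best_gain:
                        best, best_gain = i, gain
            if best is None:
                break
            chosen.append(best)
            spent += costs[best]
            remaining.remove(best)
        # Candidate 2: the single affordable element of largest value.
        affordable = [i for i in range(len(sets)) if costs[i] <= budget]
        single = max(affordable, key=lambda i: value([i], sets), default=None)
        candidates = [chosen] + ([[single]] if single is not None else [])
        return max(candidates, key=lambda S: value(S, sets))

    sets = [{1, 2, 3}, {3, 4}, {5}, {1, 2, 3, 4, 5, 6}]
    costs = [2, 1, 1, 5]
    print(two_candidate_greedy(sets, costs, budget=4))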

    Coresets for Relational Data and The Applications

    A coreset is a small set that can approximately preserve the structure of the original input data set. We can therefore run an algorithm on a coreset so as to reduce the total computational complexity. Conventional coreset techniques assume that the input data set is available to process explicitly. However, this assumption may not hold in real-world scenarios. In this paper, we consider the problem of coreset construction over relational data. Namely, the data is decoupled into several relational tables, and it could be very expensive to directly materialize the data matrix by joining the tables. We propose a novel approach called "aggregation tree with pseudo-cube" that builds a coreset bottom-up. Moreover, our approach neatly circumvents several troublesome issues of relational learning problems [Khamis et al., PODS 2019]. Under some mild assumptions, we show that our coreset approach can be applied to machine learning tasks such as clustering, logistic regression and SVM.
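    To make the generic "run the algorithm on a coreset" idea concrete, the toy sketch below uses a crude uniform random sample as a stand-in coreset for k-means; real coresets rely on careful weighted sampling with provable guarantees, and the paper's aggregation-tree construction over relational tables is considerably more involved. Everything here is assumed for illustration only.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100_000, 5))   # the fully materialized data matrix

    idx = rng.choice(len(X), size=1_000, replace=False)
    coreset = X[idx]                    # small stand-in "coreset"

    # Cluster the small set, then evaluate the resulting centers on all data.
    centers = KMeans(n_clusters=5, n_init=10).fit(coreset).cluster_centers_
    cost = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1).sum()
    print(f"k-means cost of coreset-derived centers on the full data: {cost:.1f}")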