Discriminative Distance-Based Network Indices with Application to Link Prediction
In large networks, using the length of shortest paths as the distance measure
has shortcomings. A well-studied shortcoming is that extending it to
disconnected graphs and directed graphs is controversial. The second
shortcoming is that a huge number of vertices may have exactly the same score.
The third shortcoming is that, in many applications, the distance between two
vertices depends not only on the length of shortest paths but also on the
number of shortest paths. In this paper, we first develop a new distance
measure between vertices of a graph that yields discriminative distance-based
centrality indices. This measure is proportional to the length of shortest
paths and inversely proportional to the number of shortest paths. We present
algorithms for exact computation of the proposed discriminative indices.
Second, we develop randomized algorithms that precisely estimate average
discriminative path length and average discriminative eccentricity and show
that they give (ε, δ)-approximations of these indices. Third, we
perform extensive experiments over several real-world networks from different
domains. In our experiments, we first show that, compared to the traditional
indices, the discriminative indices usually have much higher discriminability.
Then, we show that our randomized algorithms can estimate average
discriminative path length and average discriminative eccentricity very
precisely, using only a few samples. Then, we show that real-world networks
usually have a tiny average discriminative path length, bounded by a constant
(e.g., 2). Fourth, in order
to better motivate the usefulness of our proposed distance measure, we present
a novel link prediction method that uses discriminative distance to decide
which vertices are more likely to form a link in the future, and we show its
superior performance compared to well-known existing measures.
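
The abstract does not spell out the exact formula, but a natural reading is a
distance of the form dist(u, v) / sigma(u, v), i.e., shortest-path length
divided by the number of shortest paths. The Python sketch below computes this
assumed form with a single BFS that counts shortest paths (Brandes-style) and
estimates the average discriminative path length by sampling source vertices;
the function names, the dist/sigma form, and the sampling scheme are
illustrative assumptions, not the paper's exact algorithms.

    from collections import defaultdict, deque
    import random

    def discriminative_distances(adj, source):
        # Single BFS from `source` recording, for each reachable vertex v,
        # dist(source, v) and sigma(source, v), the number of shortest paths.
        # The returned score dist/sigma is one assumed instantiation of the
        # paper's measure: it grows with path length and shrinks with the
        # number of shortest paths.
        dist = {source: 0}
        sigma = defaultdict(int)
        sigma[source] = 1
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:            # first visit to w
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:   # u lies on a shortest path to w
                    sigma[w] += sigma[u]
        # Unreachable vertices are simply omitted in this sketch.
        return {v: dist[v] / sigma[v] for v in dist if v != source}

    def estimate_avg_disc_path_length(adj, num_samples, rng=random):
        # Sampling-based estimator in the spirit of the paper's randomized
        # algorithms: average the discriminative distances seen from a few
        # uniformly chosen source vertices.
        vertices = list(adj)
        total, count = 0.0, 0
        for _ in range(num_samples):
            dd = discriminative_distances(adj, rng.choice(vertices))
            total += sum(dd.values())
            count += len(dd)
        return total / count if count else float("inf")

Here `adj` is a dict mapping each vertex to its neighbors. If the abstract's
claim holds, a small `num_samples` should already give an estimate close to
the exact average on typical real-world graphs.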
Budget-Feasible Mechanism Design for Non-Monotone Submodular Objectives: Offline and Online
The framework of budget-feasible mechanism design studies procurement
auctions where the auctioneer (buyer) aims to maximize his valuation function
subject to a hard budget constraint. We study the problem of designing truthful
mechanisms that have good approximation guarantees and never pay the
participating agents (sellers) more than the budget. We focus on the case of
general (non-monotone) submodular valuation functions and derive the first
truthful, budget-feasible, and O(1)-approximate mechanisms that run in
polynomial time in the value query model, for both offline and online auctions.
Prior to our work, the only O(1)-approximation mechanism known for
non-monotone submodular objectives required an exponential number of value
queries.
At the heart of our approach lies a novel greedy algorithm for non-monotone
submodular maximization under a knapsack constraint. Our algorithm builds two
candidate solutions simultaneously (to achieve a good approximation), yet
ensures that agents cannot jump from one solution to the other (to implicitly
enforce truthfulness). Ours is the first mechanism for the problem
where---crucially---the agents are not ordered with respect to their marginal
value per cost. This allows us to appropriately adapt these ideas to the online
setting as well.
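
To make the two-candidate idea concrete, here is a heavily simplified Python
sketch: it grows two budget-feasible sets at once, tries each item in the
first set and otherwise in the second, and never moves an item between them.
This is only an illustration of the structural idea described above; the
actual mechanism involves payments, additional candidate comparisons (e.g.,
against the best single item), and truthfulness arguments not reproduced here.

    def two_candidate_greedy(items, cost, value, budget):
        # `value` is a set-function oracle, `cost` a per-item price dict.
        # Crucially, `items` is NOT pre-sorted by marginal value per cost,
        # matching the property highlighted in the abstract.
        S1, S2 = set(), set()
        c1 = c2 = 0.0
        for i in items:
            # Try item i in the first solution; once rejected there, it can
            # only ever join the second solution (no jumping between them).
            if c1 + cost[i] <= budget and value(S1 | {i}) > value(S1):
                S1.add(i)
                c1 += cost[i]
            elif c2 + cost[i] <= budget and value(S2 | {i}) > value(S2):
                S2.add(i)
                c2 += cost[i]
        return max((S1, S2), key=value)

Keeping two solutions hedges against the non-monotonicity of the objective:
an item that hurts one candidate set may still help the other.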
To further illustrate the applicability of our approach, we also consider the
case where additional feasibility constraints are present. We obtain
O(p)-approximation mechanisms for both monotone and non-monotone submodular
objectives, when the feasible solutions are independent sets of a p-system.
With the exception of additive valuation functions, no mechanisms were known
for this setting prior to our work. Finally, we provide lower bounds suggesting
that, when one cares about non-trivial approximation guarantees in polynomial
time, our results are asymptotically best possible.
Coresets for Relational Data and The Applications
A coreset is a small set that approximately preserves the structure of the
original input data set. Therefore, we can run our algorithms on a coreset
to reduce the total computational complexity. Conventional coreset techniques
assume that the input data set is available to process explicitly. However,
this assumption may not hold in real-world scenarios. In this paper, we
consider the problem of coresets construction over relational data. Namely, the
data is decoupled into several relational tables, and it could be very
expensive to directly materialize the data matrix by joining the tables. We
propose a novel approach called "aggregation tree with pseudo-cube" that can
build a coreset from the bottom up. Moreover, our approach can neatly circumvent
several troublesome issues of relational learning problems [Khamis et al., PODS
2019]. Under some mild assumptions, we show that our coreset approach can be
applied to machine learning tasks such as clustering, logistic regression,
and SVM.
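
As a toy illustration of why one wants to avoid materializing the join, the
sketch below samples rows uniformly from the join of two tables using only
per-key counts, never building the joined matrix. This is not the paper's
aggregation-tree-with-pseudo-cube construction; it merely shows the flavor of
sampling-based summaries over relational data, and all names are hypothetical.

    import random
    from collections import defaultdict

    def sample_join(R, S, key, k, rng=random):
        # Draw k rows uniformly from the join of tables R and S on a shared
        # attribute `key`, without materializing the join. R and S are lists
        # of dicts.
        by_key = defaultdict(list)
        for s in S:
            by_key[s[key]].append(s)
        # Weight each row of R by its number of join partners in S, so that
        # every tuple of the full join is equally likely to be drawn.
        weights = [len(by_key[r[key]]) for r in R]
        if not any(weights):
            return []  # the join is empty
        samples = []
        for _ in range(k):
            r = rng.choices(R, weights=weights, k=1)[0]
            s = rng.choice(by_key[r[key]])
            samples.append({**r, **s})
        return samples

A learning task can then be run on the sample (or on a coreset built from
such samples) at a cost independent of the potentially huge join size.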