Diversifying Top-K Results
Top-k query processing finds a list of the k results with the largest scores
w.r.t. the user-given query, under the assumption that the k results are
independent of each other. In practice, some of the top-k results returned can
be very similar to each other, and such results are redundant. In the
literature, diversified top-k search has been studied to return k results that
take both score and diversity into consideration. Most existing solutions for
diversified top-k search assume that the scores of all search results are
given, and some solve the diversity problem only in a specific setting and can
hardly be extended to general cases. In this paper, we
study the diversified top-k search problem. We define a general diversified
top-k search problem that only considers the similarity of the search results
themselves. We propose a framework, such that most existing solutions for top-k
query processing can be extended easily to handle diversified top-k search, by
simply applying three new functions, a sufficient stop condition sufficient(),
a necessary stop condition necessary(), and an algorithm for diversified top-k
search on the current set of generated results, div-search-current(). We
propose three new algorithms, namely, div-astar, div-dp, and div-cut, to solve
the div-search-current() problem. div-astar is an A*-based algorithm. div-dp
decomposes the results into components, which are searched independently using
div-astar and combined using dynamic programming. div-cut further decomposes
the current set of generated results using cut points and combines the partial
results using sophisticated operations. We conducted extensive
performance studies using two real datasets, enwiki and reuters. Our div-cut
algorithm finds the optimal solution for diversified top-k search problem in
seconds even for k as large as 2,000. (Comment: VLDB201)
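The three-function framework described above can be sketched in a few lines. Everything below is a hypothetical stand-in: the greedy div_search_current in particular replaces the exact div-astar/div-dp/div-cut algorithms, and the similarity test and result format are assumptions; only the overall structure (incremental generation plus a sufficient stop condition) follows the abstract.

```python
def div_search_current(candidates, k, similar):
    """Greedy stand-in for the exact div-astar/div-dp/div-cut algorithms:
    pick the highest-scoring result whose similarity with every
    already-chosen result is below a threshold (encoded in `similar`)."""
    chosen = []
    for r in sorted(candidates, key=lambda r: -r["score"]):
        if all(not similar(r, c) for c in chosen):
            chosen.append(r)
            if len(chosen) == k:
                break
    return chosen

def diversified_top_k(stream, k, similar):
    """Consume results in descending score order.  For this greedy
    stand-in, a valid sufficient() stop condition is: once k mutually
    dissimilar results are found, later (lower-scoring) results cannot
    change the greedy answer, so we can stop generating."""
    seen = []
    for r in stream:  # stream yields results in descending score order
        seen.append(r)
        current = div_search_current(seen, k, similar)
        if len(current) == k:  # sufficient(): answer already complete
            return current
    return div_search_current(seen, k, similar)
```

The early stop is what makes the framework attractive: the underlying top-k engine need not materialize all results before diversification.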
Evolution of cooperation in spatial traveler's dilemma game
Traveler's dilemma (TD) is a social dilemma that has been well studied in the
economics community, but it has attracted little attention in the physics
community. The TD game is a two-person game. Each player can select an integer
value between L and M (L < M) as a pure strategy. If both of them select the
same value, the payoff to each of them will be that value. If the players
select different values, say i and j (i < j), then the payoff to the player
who chooses the smaller value will be i + R and the payoff to the other player
will be i - R, where R is the reward parameter. We term the player who selects
a large value the cooperator, and the one who chooses a small value the
defector. The reason is that if both of them select large values, it will
result in a large total payoff. The Nash equilibrium of the TD game is to
choose the smallest value L. However, in previous behavioral studies, players
in the TD game typically select values that are much larger than L, and the
average selected value exhibits an inverse relationship with R. To explain
such anomalous behavior,
in this paper, we study the evolution of cooperation in spatial traveler's
dilemma game where the players are located on a square lattice and each player
plays TD games with his neighbors. Players in our model can adopt their
neighbors' strategies following two standard models of spatial game dynamics.
Monte-Carlo simulations are applied to our model, and the results show that
the cooperation level of the system, which is proportional to the average
value of the strategies, decreases with increasing R until R exceeds a
threshold where cooperation vanishes. Our findings indicate that spatial
reciprocity promotes the evolution of cooperation in TD game and the spatial TD
game model can interpret the anomalous behavior observed in previous
behavioral experiments.
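As a concrete illustration of the payoff rules above (using the generic symbols m for the lower claim and R for the reward parameter; the abstract's original notation did not survive extraction, so these letters are assumptions):

```python
def td_payoffs(x, y, R=2):
    """Payoffs of the traveler's dilemma for claims x and y.
    If the claims are equal, both players receive that value; otherwise
    the lower claim m sets the base: the low claimant gets m + R and the
    high claimant gets m - R.  Symbols m and R are generic choices."""
    if x == y:
        return x, x
    m = min(x, y)
    low, high = m + R, m - R           # reward the lower claim, punish the higher
    return (low, high) if x < y else (high, low)
```

The dilemma is visible in the numbers: against an opponent claiming v, undercutting to v - 1 earns v - 1 + R > v whenever R > 1, so iterated undercutting drives rational play down to the smallest claim, even though both players would earn more by jointly claiming high values.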
Keyword Search on RDF Graphs - A Query Graph Assembly Approach
Keyword search provides ordinary users an easy-to-use interface for querying
RDF data. Given the input keywords, in this paper we study how to assemble a
query graph that represents the user's query intention accurately and
efficiently. Based on the input keywords, we first obtain the elementary query
graph building blocks, such as entity/class vertices and predicate edges. Then,
we formally define the query graph assembly (QGA) problem. Unfortunately, we
prove theoretically that QGA is an NP-complete problem. To solve it, we design
some heuristic lower bounds and propose a bipartite graph
matching-based best-first search algorithm. The algorithm's time complexity is
a function of n and d only, where n is the number of keywords and d is a
tunable parameter, i.e., the maximum number of candidate entity/class vertices
and predicate edges allowed to match each keyword. Although QGA is
intractable, both n and d are small in practice. Furthermore, the algorithm's
time
complexity does not depend on the RDF graph size, which guarantees the good
scalability of our system in large RDF graphs. Experiments on DBpedia and
Freebase confirm the superiority of our system in both effectiveness and
efficiency.
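The best-first strategy the abstract relies on can be sketched generically: with an admissible cost lower bound, the first complete state popped from the priority queue is optimal. The state representation, expansion, and lower bound below are illustrative placeholders, not the paper's actual QGA formulation.

```python
import heapq

def best_first_search(start, expand, lower_bound, is_goal):
    """Generic best-first (A*-style) search: always expand the partial
    state with the smallest cost lower bound.  If lower_bound never
    overestimates the true completion cost, the first goal state popped
    is an optimal solution."""
    tie = 0                                   # tiebreaker so states need not be comparable
    heap = [(lower_bound(start), tie, start)]
    while heap:
        _, _, state = heapq.heappop(heap)
        if is_goal(state):
            return state
        for nxt in expand(state):
            tie += 1
            heapq.heappush(heap, (lower_bound(nxt), tie, nxt))
    return None
```

In the QGA setting, a state would correspond to a partial assembly of entity/class vertices and predicate edges, and the heuristic lower bounds prune assemblies that cannot beat the best complete query graph found so far.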
Quasi-SLCA based Keyword Query Processing over Probabilistic XML Data
The probabilistic threshold query is one of the most common queries in
uncertain databases, where a result satisfying the query must also have a
probability that meets the threshold requirement. In this paper, we
investigate probabilistic threshold keyword queries (PrTKQ) over XML data,
which have not been studied before. We first introduce the notion of
quasi-SLCA and use it to
represent results for a PrTKQ with the consideration of possible world
semantics. Then we design a probabilistic inverted (PI) index that can be used
to quickly return the qualified answers and filter out the unqualified ones
based on our proposed lower/upper bounds. After that, we propose two efficient
and comparable algorithms: Baseline Algorithm and PI index-based Algorithm. To
accelerate the algorithms, we also utilize probability density functions. An
empirical study using real and synthetic data sets has verified the
effectiveness and the efficiency of our approaches.
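The lower/upper-bound filtering idea can be illustrated independently of the index itself. The three-way split below is a generic sketch of bound-based pruning against a probability threshold, not the paper's PI-index implementation; candidate tuples and bound values are assumptions.

```python
def filter_by_bounds(candidates, threshold):
    """Bound-based pruning: a candidate whose probability lower bound
    already meets the threshold is accepted outright; one whose upper
    bound falls below it is discarded; only the remainder need the
    expensive exact probability computation."""
    accepted, pruned, undecided = [], [], []
    for cand, lo, hi in candidates:       # (candidate, lower bound, upper bound)
        if lo >= threshold:
            accepted.append(cand)         # qualifies for any exact prob in [lo, hi]
        elif hi < threshold:
            pruned.append(cand)           # cannot qualify
        else:
            undecided.append(cand)        # needs exact evaluation
    return accepted, pruned, undecided
```

The payoff is that exact possible-world probability computation, the dominant cost, is confined to the undecided bucket.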
A Fast Order-Based Approach for Core Maintenance
Graphs have been widely used in many applications such as social networks,
collaboration networks, and biological networks. One important graph analytics
task is to explore cohesive subgraphs in a large graph. Among the cohesive
subgraph models studied, the k-core is one that can be computed in linear time
for a static graph. Since graphs evolve in real applications, in this paper we
study core maintenance, which aims to reduce the computational cost of
recomputing k-cores when a graph is dynamically updated. We
identify drawbacks of the most efficient existing algorithm: it searches a
large space to find the vertices that need to be updated, and it incurs high
overhead to maintain its index when the graph is updated. We propose a new
order-based approach to maintain an order, called k-order, among vertices,
while a graph is updated. Our new algorithm significantly outperforms the
state-of-the-art algorithm, by up to 3 orders of magnitude, on the 11 large
real graphs tested. We report our findings in this paper.
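For contrast with the maintenance problem, the static computation the abstract mentions is the classic peeling algorithm: repeatedly remove a vertex of minimum remaining degree, and record for each vertex the largest k at which it survives. The heap-based variant below is a simple O(m log n) sketch; bucket queues give the linear-time bound the abstract cites.

```python
import heapq

def core_numbers(adj):
    """Core decomposition of an undirected graph given as
    {vertex: set(neighbours)}.  core[v] is the largest k such that v
    belongs to the k-core (a maximal subgraph with min degree >= k)."""
    deg = {v: len(ns) for v, ns in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    core, removed, k = {}, set(), 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue                      # stale heap entry; skip
        k = max(k, d)                     # core level never decreases during peeling
        core[v] = k
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return core
```

Rerunning this from scratch on every edge insertion or deletion is exactly the cost that order-based core maintenance, as studied in the paper, is designed to avoid.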