419 research outputs found

Diversifying Top-K Results

Authors: Chang Lijun, Qin Lu, Yu Jeffrey Xu
Publication date: 01/01/2012

Top-k query processing finds a list of k results that have the largest scores w.r.t. the user-given query, under the assumption that all k results are independent of each other. In practice, some of the top-k results returned can be very similar to each other and are therefore redundant. In the literature, diversified top-k search has been studied to return k results that take both score and diversity into consideration. Most existing solutions for diversified top-k search assume that the scores of all the search results are given, and some works solve the diversity problem for a specific setting and can hardly be extended to general cases. In this paper, we study the diversified top-k search problem. We define a general diversified top-k search problem that only considers the similarity of the search results themselves. We propose a framework such that most existing solutions for top-k query processing can be extended easily to handle diversified top-k search by simply applying three new functions: a sufficient stop condition sufficient(), a necessary stop condition necessary(), and an algorithm for diversified top-k search on the current set of generated results, div-search-current(). We propose three new algorithms, namely div-astar, div-dp, and div-cut, to solve the div-search-current() problem. div-astar is an A* based algorithm; div-dp decomposes the results into components, which are searched independently using div-astar and combined using dynamic programming. div-cut further decomposes the current set of generated results using cut points and combines the results using sophisticated operations. We conducted extensive performance studies using two real datasets, enwiki and reuters. Our div-cut algorithm finds the optimal solution for the diversified top-k search problem in seconds even for k as large as 2,000. Comment: VLDB201
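To make the problem statement above concrete, here is a minimal brute-force sketch in Python. It assumes one plausible reading of the abstract: pick at most k results with the maximum total score such that no two chosen results are marked as similar. The function name `diversified_top_k`, the `similar_pairs` input, and the toy scores are hypothetical and chosen only for illustration; this is not div-astar, div-dp, or div-cut, whose details the abstract does not give.

```python
from itertools import combinations


def diversified_top_k(scores, similar_pairs, k):
    """Exhaustively pick at most k pairwise non-similar results with max total score.

    Assumes all scores are positive. Exponential time: for illustration only.
    """
    similar = {frozenset(pair) for pair in similar_pairs}
    best_set, best_score = (), 0
    items = sorted(scores)
    for size in range(1, k + 1):
        for subset in combinations(items, size):
            # Skip any candidate set that contains two similar results.
            if any(frozenset(p) in similar for p in combinations(subset, 2)):
                continue
            total = sum(scores[r] for r in subset)
            if total > best_score:
                best_set, best_score = subset, total
    return set(best_set), best_score


# Hypothetical toy input: r2 is similar to both r1 and r3.
scores = {"r1": 10, "r2": 9, "r3": 8, "r4": 3}
similar_pairs = [("r1", "r2"), ("r2", "r3")]

plain_top_2 = sorted(scores, key=scores.get, reverse=True)[:2]
chosen, total = diversified_top_k(scores, similar_pairs, 2)
print(plain_top_2)            # plain top-2 keeps the redundant pair r1, r2
print(sorted(chosen), total)  # diversified top-2 swaps r2 for r3
```

On this toy input, plain top-2 returns the two similar results r1 and r2, while the diversified variant returns r1 and r3 with total score 18.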

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

Efficient Maximum $k$-Defective Clique Computation with Improved Time Complexity

Author: Chang Lijun
Publication date: 05/09/2023

$k$-defective cliques relax cliques by allowing up-to $k$ missing edges from being a complete graph. This relaxation enables us to find larger near-cliques and has applications in link prediction, cluster detection, social network analysis and transportation science. The problem of finding the largest $k$-defective clique has been recently studied with several algorithms being proposed in the literature. However, the currently fastest algorithm KDBB does not improve its time complexity from being the trivial $O(2^n)$, and also, KDBB's practical performance is still not satisfactory. In this paper, we advance the state of the art for exact maximum $k$-defective clique computation, in terms of both time complexity and practical performance. Moreover, we separate the techniques required for achieving the time complexity from others purely used for practical performance consideration; this design choice may facilitate the research community to further improve the practical efficiency while not sacrificing the worst case time complexity. In specific, we first develop a general framework kDC that beats the trivial time complexity of $O(2^n)$ and achieves a better time complexity than all existing algorithms. The time complexity of kDC is solely achieved by non-fully-adjacent-first branching rule, excess-removal reduction rule and high-degree reduction rule. Then, to make kDC practically efficient, we further propose a new upper bound, two reduction rules, and an algorithm for efficiently computing a large initial solution. Extensive empirical studies on three benchmark graph collections with 290 graphs in total demonstrate that kDC outperforms the currently fastest algorithm KDBB by several orders of magnitude. Comment: Accepted by SIGMOD 2024 in May 202
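The definition in the abstract above (a vertex set that misses at most k edges from being complete) can be illustrated with a short Python sketch. The helper names `missing_edges`, `is_k_defective_clique`, and `max_k_defective_clique_bruteforce`, as well as the toy graph, are hypothetical. The search is plain exhaustive enumeration, used only to show the definition; it is neither kDC nor KDBB and uses none of their branching or reduction rules.

```python
from itertools import combinations


def missing_edges(adj, vertices):
    """Count vertex pairs in `vertices` that are not joined by an edge."""
    return sum(1 for u, v in combinations(vertices, 2) if v not in adj[u])


def is_k_defective_clique(adj, vertices, k):
    """True if `vertices` misses at most k edges from being a complete graph."""
    return missing_edges(adj, vertices) <= k


def max_k_defective_clique_bruteforce(adj, k):
    """Try subsets from largest to smallest; exponential, for illustration only."""
    nodes = sorted(adj)
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            if is_k_defective_clique(adj, subset, k):
                return set(subset)
    return set()


# Hypothetical undirected toy graph as adjacency sets; only edge {1, 4} is absent.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}

print(max_k_defective_clique_bruteforce(adj, 0))  # an ordinary maximum clique
print(max_k_defective_clique_bruteforce(adj, 1))  # one missing edge allowed
```

With k = 0 the sketch finds a triangle; with k = 1 the whole four-vertex graph qualifies, which shows how the relaxation admits larger near-cliques.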

arXiv.org e-Print Archive

More is simpler : effectively and efficiently assessing node-pair similarities based on hyperlinks

Authors: Chang Lijun, Lin Xuemin, Pei Jian, Yu Weiren, Zhang Wenjie
Publication venue: ACM
Publication date: 01/09/2013

Similarity assessment is one of the core tasks in hyperlink analysis. Recently, with the proliferation of applications, e.g., web search and collaborative filtering, SimRank has been a well-studied measure of similarity between two nodes in a graph. It recursively follows the philosophy that "two nodes are similar if they are referenced (have incoming edges) from similar nodes", which can be viewed as an aggregation of similarities based on incoming paths. Despite its popularity, SimRank has an undesirable property, i.e., "zero-similarity": it only accommodates paths of equal length from a common "center" node. Thus, a large portion of other paths are fully ignored. This paper attempts to remedy this issue. (1) We propose and rigorously justify SimRank*, a revised version of SimRank, which resolves such counter-intuitive "zero-similarity" issues while inheriting the merits of the basic SimRank philosophy. (2) We show that the series form of SimRank* can be reduced to a fairly succinct and elegant closed form, which looks even simpler than SimRank, yet enriches semantics without suffering from increased computational cost. This leads to a fixed-point iterative paradigm of SimRank* in O(Knm) time on a graph of n nodes and m edges for K iterations, which is comparable to SimRank. (3) To further optimize SimRank* computation, we leverage a novel clustering strategy via edge concentration. Due to its NP-hardness, we devise an efficient and effective heuristic to speed up SimRank* computation to $O(Kn\tilde{m})$ time, where $\tilde{m}$ is generally much smaller than m. (4) Using real and synthetic data, we empirically verify the rich semantics of SimRank*, and demonstrate its high computation efficiency.
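The "zero-similarity" issue described above can be reproduced with a small Python sketch of the basic SimRank iteration (not SimRank*, whose closed form the abstract does not state). The function name `simrank`, the decay factor 0.6, and the toy graph are assumptions made for illustration; the implementation is the naive pairwise iteration and makes no attempt to match the running times quoted in the abstract.

```python
def simrank(in_neighbors, decay=0.6, iterations=10):
    """Naive basic SimRank: s(a, a) = 1 and, for a != b,
    s(a, b) = decay / (|I(a)| * |I(b)|) * sum of s(i, j) over in-neighbor pairs."""
    nodes = sorted(in_neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # A node without in-neighbors is similar only to itself.
                    new[(a, b)] = 0.0
                    continue
                total = sum(sim[(i, j)] for i in ia for j in ib)
                new[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new
    return sim


# Hypothetical toy graph with edges a->b, a->c, b->d, stored as in-neighbor lists.
in_neighbors = {"a": [], "b": ["a"], "c": ["a"], "d": ["b"]}
scores = simrank(in_neighbors)

print(scores[("b", "c")])  # both are one step from the common source a
print(scores[("c", "d")])  # a reaches c in one step but d in two steps
```

Here b and c receive a positive score because both sit one step from a, while c and d score zero even though both are reachable from a, since the two paths from a have different lengths.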

CiteSeerX

Crossref

Warwick Research Archives Portal Repository

Spiral - Imperial College Digital Repository

Application of Narrative Theory in News Transediting: A Case Study of the International Political News in Reference News

Authors: CHANG Zhiyu, LI Lijun
Publication venue: Canadian Academy of Oriental and Occidental Culture
Publication date: 26/09/2022

Reference News is the only newspaper on the Chinese mainland that has the legal authority to publish foreign news directly, giving a transedited version of the latest news and comments from around the world. Transediting, as a distinct type of translation, must not only faithfully convey the original content but also adapt the original structure in light of the international situation, to fulfill the needs of multiple parties. According to narrative theory, translation is re-narration, and the original text's frame of time and space must be reconstructed, which coincides with the method required for transediting. In this paper, the author draws on narrative theory to analyze the transedited news on the Russia-Ukraine conflict in Reference News, with a view to conducting transediting practice effectively.

CSCanada.net: E-Journals (Canadian Academy of Oriental and Occidental Culture, Canadian Research & Development Center of Sciences and Cultures)

Laser absorption spectroscopy for combustion diagnosis in reactive flows: A review

Authors: Liu Chang, Xu Lijun
Publication venue: Informa UK Limited
Publication date: 13/04/2018

Crossref

Edinburgh Research Explorer

Perspectives on instrumentation development for chemical species tomography in reactive-flow diagnosis

Authors: Liu Chang, McCann Hugh, Xu Lijun
Publication date: 13/07/2023

Edinburgh Research Explorer