
543 research outputs found

Visualising the structure of document search results: A comparison of graph theoretic approaches

Author: Busing F.
Chen C.
Coxon A.
Leuski A.
Salton G.
Skupin A.
Timothy Cribbin
Van Rijsbergen C.J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/04/2009

This is the post-print of the article - Copyright @ 2010 Sage Publications. Previous work has shown that distance-similarity visualisation or ‘spatialisation’ can provide a potentially useful context in which to browse the results of a query search, enabling the user to adopt a simple local foraging or ‘cluster growing’ strategy to navigate through the retrieved document set. However, faithfully mapping feature-space models to visual space can be problematic owing to their inherent high dimensionality and non-linearity. Conventional linear approaches to dimension reduction tend to fail at this kind of task, sacrificing local structural detail in order to preserve a globally optimal mapping. In this paper the clustering performance of a recently proposed algorithm called isometric feature mapping (Isomap), which deals with non-linearity by transforming dissimilarities into geodesic distances, is compared to that of non-metric multidimensional scaling (MDS). Various graph pruning methods for geodesic distance estimation are also compared. Results show that Isomap is significantly better at preserving local structural detail than MDS, suggesting it is better suited to cluster growing and other semantic navigation tasks. Moreover, it is shown that applying a minimum-cost graph pruning criterion can provide a parameter-free alternative to the traditional K-neighbour method, resulting in spatial clustering that is equivalent to or better than that achieved using an optimal-K criterion.
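The geodesic-distance step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the traditional K-neighbour pruning rule followed by all-pairs shortest paths; the function name `knn_geodesic_distances` and its interface are hypothetical and are not taken from the article, and the minimum-cost pruning criterion the article proposes is not shown.

```python
import math


def knn_geodesic_distances(dist, k):
    """Estimate geodesic distances from a symmetric dissimilarity matrix.

    Keep, for every item, only the edges to its k nearest neighbours
    (K-neighbour pruning), then run Floyd-Warshall so that each entry
    becomes the length of the shortest path through the pruned graph.
    Pairs left in different components keep a distance of infinity.
    """
    n = len(dist)
    geo = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        geo[i][i] = 0.0
        # Indices of the k nearest neighbours of i, excluding i itself.
        others = (j for j in range(n) if j != i)
        neighbours = sorted(others, key=lambda j: dist[i][j])[:k]
        for j in neighbours:
            # Symmetrise: an edge survives if either endpoint selects it.
            geo[i][j] = geo[j][i] = dist[i][j]
    # Floyd-Warshall all-pairs shortest paths on the pruned graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = geo[i][m] + geo[m][j]
                if via < geo[i][j]:
                    geo[i][j] = via
    return geo
```

A smaller `k` prunes more aggressively, so distant items are connected only through chains of near neighbours; the resulting matrix would then be passed to a scaling method to obtain the 2D layout.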

Crossref

Brunel University Research Archive

AutoPruner: Transformer-Based Call Graph Pruning

Author: Haryono Stefanus Agus
Kang Hong Jin
Le Xuan-Bach D.
Le-Cong Thanh
Lo David
Nguyen Truong Giang
Thang Huynh Quyet
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/09/2022

Constructing a static call graph requires trade-offs between soundness and precision. Program analysis techniques for constructing call graphs are unfortunately usually imprecise. To address this problem, researchers have recently proposed call graph pruning empowered by machine learning to post-process call graphs constructed by static analysis. A machine learning model is built to capture information from the call graph by extracting structural features for use in a random forest classifier. It then removes edges that are predicted to be false positives. Despite the improvements shown by machine learning models, they are still limited as they do not consider the source code semantics and thus often are not able to effectively distinguish true and false positives. In this paper, we present a novel call graph pruning technique, AutoPruner, for eliminating false positives in call graphs via both statistical semantic and structural analysis. Given a call graph constructed by traditional static analysis tools, AutoPruner takes a Transformer-based approach to capture the semantic relationships between the caller and callee functions associated with each edge in the call graph. To do so, AutoPruner fine-tunes a model of code that was pre-trained on a large corpus to represent source code based on descriptions of its semantics. Next, the model is used to extract semantic features from the functions related to each edge in the call graph. AutoPruner uses these semantic features together with the structural features extracted from the call graph to classify each edge via a feed-forward neural network. Our empirical evaluation on a benchmark dataset of real-world programs shows that AutoPruner outperforms the state-of-the-art baselines, improving on F-measure by up to 13% in identifying false-positive edges in a static call graph. Comment: Accepted to ESEC/FSE 2022, Research Trac

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Rate adaptive binary erasure quantization with dual fountain codes

Author: Doufexi A
Ismail MR
Piechocki RJ
Sejdinovic D
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

In this contribution, duals of fountain codes are introduced and their use for lossy source compression is investigated. It is shown both theoretically and experimentally that the source coding dual of the binary erasure channel coding problem, binary erasure quantization, is solved at a nearly optimal rate by applying duals of LT and raptor codes with a belief propagation-like algorithm which amounts to a graph pruning procedure. Furthermore, this quantizing scheme is rate adaptive, i.e., its rate can be modified on the fly in order to adapt to the source distribution, very much like LT and raptor codes are able to adapt their rate to the erasure probability of a channel.

Crossref

UCL Discovery

Oxford University Research Archive

Explore Bristol Research

The $z$-matching problem on bipartite graphs

Author: Zhao Jin-Hua
Publication date: 09/12/2018

The $z$-matching problem on bipartite graphs is studied with a local algorithm. A $z$-matching ($z \ge 1$) on a bipartite graph is a set of matched edges, in which each vertex of one type is adjacent to at most $1$ matched edge and each vertex of the other type is adjacent to at most $z$ matched edges. The $z$-matching problem on a given bipartite graph concerns finding $z$-matchings with the maximum size. Our approach to this combinatorial optimization problem is twofold. From an algorithmic perspective, we adopt a local algorithm as a linear approximate solver to find $z$-matchings on general bipartite graphs, whose basic component is a generalized version of the greedy leaf removal procedure in graph theory. From an analytical perspective, in the case of random bipartite graphs with the same size of two types of vertices, we develop a mean-field theory for the percolation phenomenon underlying the local algorithm, leading to a theoretical estimation of $z$-matching sizes on coreless graphs. We hope that our results can shed light on further study on algorithms and computational complexity of the optimization problem. Comment: 15 pages, 3 figure
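The abstract does not spell out how greedy leaf removal is generalized, so the following is only an illustrative sketch under stated assumptions, not the paper's algorithm. It assumes two removal rules: a capacity-1 vertex with a single remaining neighbour is matched to it, and a capacity-$z$ vertex whose remaining degree fits within its remaining capacity is matched to all of its neighbours. The function name and the `(a, b)` edge encoding are hypothetical.

```python
def greedy_leaf_removal_z_matching(edges, z):
    """Leaf-removal heuristic for z-matchings (illustrative assumption).

    edges: iterable of (a, b) pairs; each a-vertex may be matched once,
           each b-vertex at most z times.
    Returns (matching, core): the matched edges found by the removal
    rules and the edges of the residual core the rules cannot resolve.
    """
    adj_a, adj_b = {}, {}
    for a, b in edges:
        adj_a.setdefault(a, set()).add(b)
        adj_b.setdefault(b, set()).add(a)
    capacity = {b: z for b in adj_b}
    matching = []

    def drop_a(a):
        # Remove an a-vertex together with all of its edges.
        for b in adj_a.pop(a):
            adj_b[b].discard(a)

    def drop_b(b):
        # Remove a b-vertex together with all of its edges.
        for a in adj_b.pop(b):
            adj_a[a].discard(b)

    def match(a, b):
        matching.append((a, b))
        drop_a(a)
        capacity[b] -= 1
        if capacity[b] == 0:
            drop_b(b)

    progress = True
    while progress:
        progress = False
        # Rule 1: an a-vertex with one remaining neighbour takes it.
        for a in list(adj_a):
            if a in adj_a and len(adj_a[a]) == 1:
                match(a, next(iter(adj_a[a])))
                progress = True
        # Rule 2: a b-vertex whose degree fits its capacity takes all
        # of its remaining neighbours.
        for b in list(adj_b):
            if b in adj_b and 0 < len(adj_b[b]) <= capacity[b]:
                for a in list(adj_b[b]):
                    match(a, b)
                progress = True
    core = [(a, b) for a, nbrs in adj_a.items() for b in nbrs]
    return matching, core
```

When the returned core is empty the graph is coreless in the sense used above, and the rules alone decide every edge; a non-empty core would need a further strategy that this sketch does not attempt.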

arXiv.org e-Print Archive

Informed RRT*: Optimal Sampling-based Path Planning Focused via Direct Sampling of an Admissible Ellipsoidal Heuristic

Author: Barfoot Timothy D.
Gammell Jonathan D.
Srinivasa Siddhartha S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/11/2014

Rapidly-exploring random trees (RRTs) are popular in motion planning because they find solutions efficiently to single-query problems. Optimal RRTs (RRT*s) extend RRTs to the problem of finding the optimal solution, but in doing so asymptotically find the optimal path from the initial state to every state in the planning domain. This behaviour is not only inefficient but also inconsistent with their single-query nature. For problems seeking to minimize path length, the subset of states that can improve a solution can be described by a prolate hyperspheroid. We show that unless this subset is sampled directly, the probability of improving a solution becomes arbitrarily small in large worlds or high state dimensions. In this paper, we present an exact method to focus the search by directly sampling this subset. The advantages of the presented sampling technique are demonstrated with a new algorithm, Informed RRT*. This method retains the same probabilistic guarantees on completeness and optimality as RRT* while improving the convergence rate and final solution quality. We present the algorithm as a simple modification to RRT* that could be further extended by more advanced path-planning algorithms. We show experimentally that it outperforms RRT* in rate of convergence, final solution cost, and ability to find difficult passages while demonstrating less dependence on the state dimension and range of the planning problem. Comment: 8 pages, 11 figures. Videos available at https://www.youtube.com/watch?v=d7dX5MvDYTc and https://www.youtube.com/watch?v=nsl-5MZfwu
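The direct-sampling idea in the abstract above can be sketched for the planar case, where the prolate hyperspheroid reduces to an ellipse with the start and goal as foci. This is a minimal 2D illustration under that assumption; the function name `sample_informed_2d` and its parameters are hypothetical, and the general n-dimensional construction is not shown.

```python
import math
import random


def sample_informed_2d(start, goal, c_best, rng=random):
    """Draw one point uniformly from the ellipse of states that could lie
    on a start-to-goal path no longer than c_best (2D case only)."""
    c_min = math.dist(start, goal)  # straight-line lower bound on cost
    if c_best < c_min:
        raise ValueError("c_best cannot be below the straight-line distance")
    # Uniform sample in the unit disk (sqrt keeps the density uniform).
    r = math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    x, y = r * math.cos(theta), r * math.sin(theta)
    # Stretch the disk to the ellipse's semi-major and semi-minor axes.
    x *= c_best / 2.0
    y *= math.sqrt(c_best ** 2 - c_min ** 2) / 2.0
    # Rotate onto the start-goal axis and translate to the midpoint.
    phi = math.atan2(goal[1] - start[1], goal[0] - start[0])
    cx, cy = (start[0] + goal[0]) / 2.0, (start[1] + goal[1]) / 2.0
    return (cx + x * math.cos(phi) - y * math.sin(phi),
            cy + x * math.sin(phi) + y * math.cos(phi))
```

Because the stretch, rotation, and translation are linear, a uniform sample in the disk stays uniform in the ellipse, and every returned point satisfies the focal-distance bound. As `c_best` shrinks toward the straight-line distance the ellipse narrows, which concentrates samples where they can still improve the solution.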

arXiv.org e-Print Archive

Crossref