149,055 research outputs found
Deterministic and Probabilistic Binary Search in Graphs
We consider the following natural generalization of Binary Search: in a given
undirected, positively weighted graph, one vertex is a target. The algorithm's
task is to identify the target by adaptively querying vertices. In response to
querying a node , the algorithm learns either that is the target, or is
given an edge out of that lies on a shortest path from to the target.
We study this problem in a general noisy model in which each query
independently receives a correct answer with probability (a
known constant), and an (adversarial) incorrect one with probability .
Our main positive result is that when (i.e., all answers are
correct), queries are always sufficient. For general , we give an
(almost information-theoretically optimal) algorithm that uses, in expectation,
no more than queries, and identifies the target correctly with probability at
leas . Here, denotes the
entropy. The first bound is achieved by the algorithm that iteratively queries
a 1-median of the nodes not ruled out yet; the second bound by careful repeated
invocations of a multiplicative weights algorithm.
Even for , we show several hardness results for the problem of
determining whether a target can be found using queries. Our upper bound of
implies a quasipolynomial-time algorithm for undirected connected
graphs; we show that this is best-possible under the Strong Exponential Time
Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs
with non-uniform node querying costs, the problem is PSPACE-complete. For a
semi-adaptive version, in which one may query nodes each in rounds, we
show membership in in the polynomial hierarchy, and hardness
for
Binary search in graphs revisited
In the classical binary search in a path the aim is to detect an unknown target by asking as few queries as possible, where each query reveals the direction to the target. This binary search algorithm has been recently extended by Emamjomeh-Zadeh et al. (in: Proceedings of the 48th annual ACM SIGACT symposium on theory of computing, STOC 2016, Cambridge, pp. 519–532, 2016) to the problem of detecting a target in an arbitrary graph. Similarly to the classical case in the path, the algorithm of Emamjomeh-Zadeh et al. maintains a candidates’ set for the target, while each query asks an appropriately chosen vertex—the “median”—which minimises a potential Φ among the vertices of the candidates’ set. In this paper we address three open questions posed by Emamjomeh-Zadeh et al., namely (a) detecting a target when the query response is a direction to an approximately shortest path to the target, (b) detecting a target when querying a vertex that is an approximate median of the current candidates’ set (instead of an exact one), and (c) detecting multiple targets, for which to the best of our knowledge no progress has been made so far. We resolve questions (a) and (b) by providing appropriate upper and lower bounds, as well as a new potential Γ that guarantees efficient target detection even by querying an approximate median each time. With respect to (c), we initiate a systematic study for detecting two targets in graphs and we identify sufficient conditions on the queries that allow for strong (linear) lower bounds and strong (polylogarithmic) upper bounds for the number of queries. All of our positive results can be derived using our new potential Γ that allows querying approximate medians
Competitive Online Search Trees on Trees
We consider the design of adaptive data structures for searching elements of
a tree-structured space. We use a natural generalization of the rotation-based
online binary search tree model in which the underlying search space is the set
of vertices of a tree. This model is based on a simple structure for
decomposing graphs, previously known under several names including elimination
trees, vertex rankings, and tubings. The model is equivalent to the classical
binary search tree model exactly when the underlying tree is a path. We
describe an online -competitive search tree data structure in
this model, matching the best known competitive ratio of binary search trees.
Our method is inspired by Tango trees, an online binary search tree algorithm,
but critically needs several new notions including one which we call
Steiner-closed search trees, which may be of independent interest. Moreover our
technique is based on a novel use of two levels of decomposition, first from
search space to a set of Steiner-closed trees, and secondly from these trees
into paths
FASER: Binary Code Similarity Search through the use of Intermediate Representations
Being able to identify functions of interest in cross-architecture software
is useful whether you are analysing for malware, securing the software supply
chain or conducting vulnerability research. Cross-Architecture Binary Code
Similarity Search has been explored in numerous studies and has used a wide
range of different data sources to achieve its goals. The data sources
typically used draw on common structures derived from binaries such as function
control flow graphs or binary level call graphs, the output of the disassembly
process or the outputs of a dynamic analysis approach. One data source which
has received less attention is binary intermediate representations. Binary
Intermediate representations possess two interesting properties: they are cross
architecture by their very nature and encode the semantics of a function
explicitly to support downstream usage. Within this paper we propose Function
as a String Encoded Representation (FASER) which combines long document
transformers with the use of intermediate representations to create a model
capable of cross architecture function search without the need for manual
feature engineering, pre-training or a dynamic analysis step. We compare our
approach against a series of baseline approaches for two tasks; A general
function search task and a targeted vulnerability search task. Our approach
demonstrates strong performance across both tasks, performing better than all
baseline approaches.Comment: 10 pages, Proceedings of the Conference on Applied Machine Learning
in Information Security (CAMLIS
SAFE: Self-Attentive Function Embeddings for Binary Similarity
The binary similarity problem consists in determining if two functions are
similar by only considering their compiled form. Advanced techniques for binary
similarity recently gained momentum as they can be applied in several fields,
such as copyright disputes, malware analysis, vulnerability detection, etc.,
and thus have an immediate practical impact. Current solutions compare
functions by first transforming their binary code in multi-dimensional vector
representations (embeddings), and then comparing vectors through simple and
efficient geometric operations. However, embeddings are usually derived from
binary code using manual feature extraction, that may fail in considering
important function characteristics, or may consider features that are not
important for the binary similarity problem. In this paper we propose SAFE, a
novel architecture for the embedding of functions based on a self-attentive
neural network. SAFE works directly on disassembled binary functions, does not
require manual feature extraction, is computationally more efficient than
existing solutions (i.e., it does not incur in the computational overhead of
building or manipulating control flow graphs), and is more general as it works
on stripped binaries and on multiple architectures. We report the results from
a quantitative and qualitative analysis that show how SAFE provides a
noticeable performance improvement with respect to previous solutions.
Furthermore, we show how clusters of our embedding vectors are closely related
to the semantic of the implemented algorithms, paving the way for further
interesting applications (e.g. semantic-based binary function search).Comment: Published in International Conference on Detection of Intrusions and
Malware, and Vulnerability Assessment (DIMVA) 201
CloudTree: A Library to Extend Cloud Services for Trees
In this work, we propose a library that enables on a cloud the creation and
management of tree data structures from a cloud client. As a proof of concept,
we implement a new cloud service CloudTree. With CloudTree, users are able to
organize big data into tree data structures of their choice that are physically
stored in a cloud. We use caching, prefetching, and aggregation techniques in
the design and implementation of CloudTree to enhance performance. We have
implemented the services of Binary Search Trees (BST) and Prefix Trees as
current members in CloudTree and have benchmarked their performance using the
Amazon Cloud. The idea and techniques in the design and implementation of a BST
and prefix tree is generic and thus can also be used for other types of trees
such as B-tree, and other link-based data structures such as linked lists and
graphs. Preliminary experimental results show that CloudTree is useful and
efficient for various big data applications
Sub-graph based joint sparse graph for sparse code multiple access systems
Sparse code multiple access (SCMA) is a promising air interface candidate technique for next generation mobile networks, especially for massive machine type communications (mMTC). In this paper, we design a LDPC coded SCMA detector by combining the sparse graphs of LDPC and SCMA into one joint sparse graph (JSG). In our proposed scheme, SCMA sparse graph (SSG) defined by small size indicator matrix is utilized to construct the JSG, which is termed as sub-graph based joint sparse graph of SCMA (SG-JSG-SCMA). In this paper, we first study the binary-LDPC (B-LDPC) coded SGJSG- SCMA system. To combine the SCMA variable node (SVN) and LDPC variable node (LVN) into one joint variable node (JVN), a non-binary LDPC (NB-LDPC) coded SG-JSG-SCMA is also proposed. Furthermore, to reduce the complexity of NBLDPC coded SG-JSG-SCMA, a joint trellis representation (JTR) is introduced to represent the search space of NB-LDPC coded SG-JSG-SCMA. Based on JTR, a low complexity joint trellis based detection and decoding (JTDD) algorithm is proposed to reduce the computational complexity of NB-LDPC coded SGJSG- SCMA system. According to the simulation results, SG-JSGSCMA brings significant performance improvement compare to the conventional receiver using the disjoint approach, and it can also outperform a Turbo-structured receiver with comparable complexity. Moreover, the joint approach also has advantages in terms of processing latency compare to the Turbo approaches
- …