149,055 research outputs found

    Deterministic and Probabilistic Binary Search in Graphs

    Full text link
    We consider the following natural generalization of Binary Search: in a given undirected, positively weighted graph, one vertex is a target. The algorithm's task is to identify the target by adaptively querying vertices. In response to querying a node qq, the algorithm learns either that qq is the target, or is given an edge out of qq that lies on a shortest path from qq to the target. We study this problem in a general noisy model in which each query independently receives a correct answer with probability p>12p > \frac{1}{2} (a known constant), and an (adversarial) incorrect one with probability 1p1-p. Our main positive result is that when p=1p = 1 (i.e., all answers are correct), log2n\log_2 n queries are always sufficient. For general pp, we give an (almost information-theoretically optimal) algorithm that uses, in expectation, no more than (1δ)log2n1H(p)+o(logn)+O(log2(1/δ))(1 - \delta)\frac{\log_2 n}{1 - H(p)} + o(\log n) + O(\log^2 (1/\delta)) queries, and identifies the target correctly with probability at leas 1δ1-\delta. Here, H(p)=(plogp+(1p)log(1p))H(p) = -(p \log p + (1-p) \log(1-p)) denotes the entropy. The first bound is achieved by the algorithm that iteratively queries a 1-median of the nodes not ruled out yet; the second bound by careful repeated invocations of a multiplicative weights algorithm. Even for p=1p = 1, we show several hardness results for the problem of determining whether a target can be found using KK queries. Our upper bound of log2n\log_2 n implies a quasipolynomial-time algorithm for undirected connected graphs; we show that this is best-possible under the Strong Exponential Time Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs with non-uniform node querying costs, the problem is PSPACE-complete. For a semi-adaptive version, in which one may query rr nodes each in kk rounds, we show membership in Σ2k1\Sigma_{2k-1} in the polynomial hierarchy, and hardness for Σ2k5\Sigma_{2k-5}

    Binary search in graphs revisited

    Get PDF
    In the classical binary search in a path the aim is to detect an unknown target by asking as few queries as possible, where each query reveals the direction to the target. This binary search algorithm has been recently extended by Emamjomeh-Zadeh et al. (in: Proceedings of the 48th annual ACM SIGACT symposium on theory of computing, STOC 2016, Cambridge, pp. 519–532, 2016) to the problem of detecting a target in an arbitrary graph. Similarly to the classical case in the path, the algorithm of Emamjomeh-Zadeh et al. maintains a candidates’ set for the target, while each query asks an appropriately chosen vertex—the “median”—which minimises a potential Φ among the vertices of the candidates’ set. In this paper we address three open questions posed by Emamjomeh-Zadeh et al., namely (a) detecting a target when the query response is a direction to an approximately shortest path to the target, (b) detecting a target when querying a vertex that is an approximate median of the current candidates’ set (instead of an exact one), and (c) detecting multiple targets, for which to the best of our knowledge no progress has been made so far. We resolve questions (a) and (b) by providing appropriate upper and lower bounds, as well as a new potential Γ that guarantees efficient target detection even by querying an approximate median each time. With respect to (c), we initiate a systematic study for detecting two targets in graphs and we identify sufficient conditions on the queries that allow for strong (linear) lower bounds and strong (polylogarithmic) upper bounds for the number of queries. All of our positive results can be derived using our new potential Γ that allows querying approximate medians

    Competitive Online Search Trees on Trees

    Full text link
    We consider the design of adaptive data structures for searching elements of a tree-structured space. We use a natural generalization of the rotation-based online binary search tree model in which the underlying search space is the set of vertices of a tree. This model is based on a simple structure for decomposing graphs, previously known under several names including elimination trees, vertex rankings, and tubings. The model is equivalent to the classical binary search tree model exactly when the underlying tree is a path. We describe an online O(loglogn)O(\log \log n)-competitive search tree data structure in this model, matching the best known competitive ratio of binary search trees. Our method is inspired by Tango trees, an online binary search tree algorithm, but critically needs several new notions including one which we call Steiner-closed search trees, which may be of independent interest. Moreover our technique is based on a novel use of two levels of decomposition, first from search space to a set of Steiner-closed trees, and secondly from these trees into paths

    FASER: Binary Code Similarity Search through the use of Intermediate Representations

    Full text link
    Being able to identify functions of interest in cross-architecture software is useful whether you are analysing for malware, securing the software supply chain or conducting vulnerability research. Cross-Architecture Binary Code Similarity Search has been explored in numerous studies and has used a wide range of different data sources to achieve its goals. The data sources typically used draw on common structures derived from binaries such as function control flow graphs or binary level call graphs, the output of the disassembly process or the outputs of a dynamic analysis approach. One data source which has received less attention is binary intermediate representations. Binary Intermediate representations possess two interesting properties: they are cross architecture by their very nature and encode the semantics of a function explicitly to support downstream usage. Within this paper we propose Function as a String Encoded Representation (FASER) which combines long document transformers with the use of intermediate representations to create a model capable of cross architecture function search without the need for manual feature engineering, pre-training or a dynamic analysis step. We compare our approach against a series of baseline approaches for two tasks; A general function search task and a targeted vulnerability search task. Our approach demonstrates strong performance across both tasks, performing better than all baseline approaches.Comment: 10 pages, Proceedings of the Conference on Applied Machine Learning in Information Security (CAMLIS

    SAFE: Self-Attentive Function Embeddings for Binary Similarity

    Get PDF
    The binary similarity problem consists in determining if two functions are similar by only considering their compiled form. Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code in multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code using manual feature extraction, that may fail in considering important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur in the computational overhead of building or manipulating control flow graphs), and is more general as it works on stripped binaries and on multiple architectures. We report the results from a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show how clusters of our embedding vectors are closely related to the semantic of the implemented algorithms, paving the way for further interesting applications (e.g. semantic-based binary function search).Comment: Published in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA) 201

    CloudTree: A Library to Extend Cloud Services for Trees

    Full text link
    In this work, we propose a library that enables on a cloud the creation and management of tree data structures from a cloud client. As a proof of concept, we implement a new cloud service CloudTree. With CloudTree, users are able to organize big data into tree data structures of their choice that are physically stored in a cloud. We use caching, prefetching, and aggregation techniques in the design and implementation of CloudTree to enhance performance. We have implemented the services of Binary Search Trees (BST) and Prefix Trees as current members in CloudTree and have benchmarked their performance using the Amazon Cloud. The idea and techniques in the design and implementation of a BST and prefix tree is generic and thus can also be used for other types of trees such as B-tree, and other link-based data structures such as linked lists and graphs. Preliminary experimental results show that CloudTree is useful and efficient for various big data applications

    Sub-graph based joint sparse graph for sparse code multiple access systems

    Get PDF
    Sparse code multiple access (SCMA) is a promising air interface candidate technique for next generation mobile networks, especially for massive machine type communications (mMTC). In this paper, we design a LDPC coded SCMA detector by combining the sparse graphs of LDPC and SCMA into one joint sparse graph (JSG). In our proposed scheme, SCMA sparse graph (SSG) defined by small size indicator matrix is utilized to construct the JSG, which is termed as sub-graph based joint sparse graph of SCMA (SG-JSG-SCMA). In this paper, we first study the binary-LDPC (B-LDPC) coded SGJSG- SCMA system. To combine the SCMA variable node (SVN) and LDPC variable node (LVN) into one joint variable node (JVN), a non-binary LDPC (NB-LDPC) coded SG-JSG-SCMA is also proposed. Furthermore, to reduce the complexity of NBLDPC coded SG-JSG-SCMA, a joint trellis representation (JTR) is introduced to represent the search space of NB-LDPC coded SG-JSG-SCMA. Based on JTR, a low complexity joint trellis based detection and decoding (JTDD) algorithm is proposed to reduce the computational complexity of NB-LDPC coded SGJSG- SCMA system. According to the simulation results, SG-JSGSCMA brings significant performance improvement compare to the conventional receiver using the disjoint approach, and it can also outperform a Turbo-structured receiver with comparable complexity. Moreover, the joint approach also has advantages in terms of processing latency compare to the Turbo approaches
    corecore