
    Formal Verification of Input-Output Mappings of Tree Ensembles

    Recent advances in machine learning and artificial intelligence are now being considered in safety-critical autonomous systems where software defects may cause severe harm to humans and the environment. Design organizations in these domains are currently unable to provide convincing arguments that their systems are safe to operate when machine learning algorithms are used to implement their software. In this paper, we present an efficient method to extract equivalence classes from decision trees and tree ensembles, and to formally verify that their input-output mappings comply with requirements. The idea is that, given that safety requirements can be traced to desirable properties on system input-output patterns, we can use positive verification outcomes in safety arguments. This paper presents the implementation of the method in the tool VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case studies presented in current literature. We demonstrate that our method is practical for tree ensembles trained on low-dimensional data with up to 25 decision trees and tree depths of up to 20. Our work also studies the limitations of the method with high-dimensional data and preliminarily investigates the trade-off between a large number of trees and the time taken for verification.
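The idea of extracting equivalence classes can be illustrated with a minimal Python sketch. It is not VoTE's implementation: the `Node` layout, the `x[feature] <= threshold` split convention, and the choice to combine trees by summing leaf values are all assumptions made for illustration. Each equivalence class is an axis-aligned box of inputs on which the ensemble output is constant, so a property can be checked once per box instead of once per input.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Box maps a feature index to a half-open interval (lo, hi]; absent = unbounded.
Box = Dict[int, Tuple[float, float]]

@dataclass
class Node:
    """Hypothetical tree node: internal if `feature` is set, otherwise a leaf."""
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken when x[feature] > threshold
    value: float = 0.0               # leaf output

def leaf_boxes(node: Node, box: Optional[Box] = None) -> Iterator[Tuple[Box, float]]:
    """Yield (box, leaf value) for every leaf reachable from inputs inside `box`."""
    box = dict(box or {})
    if node.feature is None:
        yield box, node.value
        return
    lo, hi = box.get(node.feature, (-math.inf, math.inf))
    if lo < node.threshold:   # some x in (lo, hi] satisfies x <= threshold
        yield from leaf_boxes(node.left, {**box, node.feature: (lo, min(hi, node.threshold))})
    if node.threshold < hi:   # some x in (lo, hi] satisfies x > threshold
        yield from leaf_boxes(node.right, {**box, node.feature: (max(lo, node.threshold), hi)})

def ensemble_classes(trees: List[Node], box: Optional[Box] = None,
                     total: float = 0.0) -> Iterator[Tuple[Box, float]]:
    """Refine boxes tree by tree; the output is assumed to be the sum of leaf values."""
    if not trees:
        yield dict(box or {}), total
        return
    for sub_box, value in leaf_boxes(trees[0], box):
        yield from ensemble_classes(trees[1:], sub_box, total + value)

def verify(trees: List[Node], prop: Callable[[Box, float], bool]) -> bool:
    """Check that `prop` holds on every equivalence class of the ensemble."""
    return all(prop(b, out) for b, out in ensemble_classes(trees))

# Two one-split trees over feature 0.
t1 = Node(0, 0.5, Node(value=1.0), Node(value=2.0))
t2 = Node(0, 0.2, Node(value=10.0), Node(value=20.0))
outputs = sorted(out for _, out in ensemble_classes([t1, t2]))
```

The number of boxes can grow quickly with the number of trees and input dimensions, which matches the scalability limits the abstract reports for high-dimensional data.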

    Online Algorithms for Multi-Level Aggregation

    In the Multi-Level Aggregation Problem (MLAP), requests arrive at the nodes of an edge-weighted tree T, and have to be served eventually. A service is defined as a subtree X of T that contains its root. This subtree X serves all requests that are pending in the nodes of X, and the cost of this service is equal to the total weight of X. Each request also incurs waiting cost between its arrival and service times. The objective is to minimize the total waiting cost of all requests plus the total cost of all service subtrees. MLAP is a generalization of some well-studied optimization problems; for example, for trees of depth 1, MLAP is equivalent to the TCP Acknowledgment Problem, while for trees of depth 2, it is equivalent to the Joint Replenishment Problem. Aggregation problems for trees of arbitrary depth arise in multicasting, sensor networks, communication in organization hierarchies, and in supply-chain management. The instances of MLAP associated with these applications are naturally online, in the sense that aggregation decisions need to be made without information about future requests. Constant-competitive online algorithms are known for MLAP with one or two levels. However, it has been open whether there exist constant-competitive online algorithms for trees of depth more than 2. Addressing this open problem, we give the first constant-competitive online algorithm for networks of arbitrary (fixed) number of levels. The competitive ratio is O(D^4 2^D), where D is the depth of T. The algorithm works for arbitrary waiting cost functions, including the variant with deadlines. We also show several additional lower and upper bound results for some special cases of MLAP, including the Single-Phase variant and the case when the tree is a path.
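The MLAP objective defined above can be made concrete with a small Python sketch that evaluates the cost of a given schedule. It only evaluates cost and is not the online algorithm from the paper. The toy tree, the node numbering with root 0, and the linear waiting cost are assumptions chosen for illustration; the abstract allows arbitrary waiting cost functions.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy tree rooted at node 0: PARENT[v] is v's parent and
# WEIGHT[v] is the weight of the edge between v and its parent.
PARENT: Dict[int, int] = {1: 0, 2: 0, 3: 1}
WEIGHT: Dict[int, float] = {1: 4.0, 2: 1.0, 3: 2.0}

def service_cost(nodes: Set[int]) -> float:
    """Total edge weight of a service subtree X, which must contain the root."""
    if 0 not in nodes:
        raise ValueError("a service subtree must contain the root")
    for v in nodes:
        if v != 0 and PARENT[v] not in nodes:
            raise ValueError(f"node {v} is not connected to the root within X")
    return sum(WEIGHT[v] for v in nodes if v != 0)

def total_cost(requests: List[Tuple[int, float]],
               services: List[Tuple[float, Set[int]]],
               wait: Callable[[float], float] = lambda delay: delay) -> float:
    """Sum of all service costs plus each request's waiting cost.

    requests: (node, arrival time); services: (service time, subtree nodes).
    A request is served by the earliest service at or after its arrival
    whose subtree contains the request's node.
    """
    cost = sum(service_cost(nodes) for _, nodes in services)
    for node, arrival in requests:
        times = [t for t, nodes in services if t >= arrival and node in nodes]
        if not times:
            raise ValueError(f"request at node {node} is never served")
        cost += wait(min(times) - arrival)
    return cost

requests = [(3, 0.0), (2, 1.0)]
services = [(2.0, {0, 1, 3}), (3.0, {0, 2})]
schedule_cost = total_cost(requests, services)
```

In this example the two services cost 6 and 1, and each request waits 2 time units, so the schedule costs 11 under the assumed linear waiting cost. Merging both requests into one later service over the whole tree would save nothing on edge weight here but would change the waiting cost, which is the trade-off an online algorithm has to manage without knowing future arrivals.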

    Planting trees in graphs, and finding them back

    In this paper we study detection and reconstruction of planted structures in Erdős-Rényi random graphs. Motivated by a problem of communication security, we focus on planted structures that consist in a tree graph. For planted line graphs, we establish the following phase diagram. In a low density region where the average degree $\lambda$ of the initial graph is below some critical value $\lambda_c=1$, detection and reconstruction go from impossible to easy as the line length $K$ crosses some critical value $f(\lambda)\ln(n)$, where $n$ is the number of nodes in the graph. In the high density region $\lambda>\lambda_c$, detection goes from impossible to easy as $K$ goes from $o(\sqrt{n})$ to $\omega(\sqrt{n})$, and reconstruction remains impossible so long as $K=o(n)$. For $D$-ary trees of varying depth $h$ and $2\le D\le O(1)$, we identify a low-density region $\lambda<\lambda_D$, such that the following holds. There is a threshold $h*=g(D)\ln(\ln(n))$ with the following properties. Detection goes from feasible to impossible as $h$ crosses $h*$. We also show that only partial reconstruction is feasible at best for $h\ge h*$. We conjecture a similar picture to hold for $D$-ary trees as for lines in the high-density region $\lambda>\lambda_D$, but confirm only the following part of this picture: Detection is easy for $D$-ary trees of size $\omega(\sqrt{n})$, while at best only partial reconstruction is feasible for $D$-ary trees of any size $o(n)$. These results are in contrast with the corresponding picture for detection and reconstruction of low rank planted structures, such as dense subgraphs and block communities: We observe a discrepancy between detection and reconstruction, the latter being impossible for a wide range of parameters where detection is easy. This property does not hold for previously studied low rank planted structures.

    A Unified approach to concurrent and parallel algorithms on balanced data structures

    Concurrent and parallel algorithms are different. However, in the case of dictionaries, both kinds of algorithms share many common points. We present a unified approach emphasizing these points. It is based on a careful analysis of the sequential algorithm, extracting from it the more basic facts, which are later encapsulated as local rules. We apply the method to the insertion algorithms in AVL trees. All the concurrent and parallel insertion algorithms have two main phases: a percolation phase, which moves the keys to be inserted down, and a rebalancing phase. Finally, some other algorithms and balanced structures are discussed. Postprint (published version)