Formal Verification of Input-Output Mappings of Tree Ensembles
Recent advances in machine learning and artificial intelligence are now being
considered in safety-critical autonomous systems where software defects may
cause severe harm to humans and the environment. Design organizations in these
domains are currently unable to provide convincing arguments that their systems
are safe to operate when machine learning algorithms are used to implement
their software.
In this paper, we present an efficient method to extract equivalence classes
from decision trees and tree ensembles, and to formally verify that their
input-output mappings comply with requirements. The idea is that, given that
safety requirements can be traced to desirable properties on system
input-output patterns, we can use positive verification outcomes in safety
arguments. This paper presents the implementation of the method in the tool
VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case
studies presented in current literature.
We demonstrate that our method is practical for tree ensembles trained on
low-dimensional data with up to 25 decision trees and tree depths of up to 20.
Our work also studies the limitations of the method with high-dimensional data
and preliminarily investigates the trade-off between a large number of trees
and the time taken for verification.
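The idea of extracting equivalence classes from a tree ensemble can be illustrated with a small sketch. It is not the VoTE implementation: the `Leaf` and `Split` types, the half-open interval convention, the summation of tree outputs, and the function names are all assumptions made for illustration. Each equivalence class is a box of inputs that every tree maps to the same leaf, so a range requirement only needs to be checked once per box.

```python
from dataclasses import dataclass
from math import inf
from typing import Dict, Iterator, List, Tuple, Union

# A box maps a feature index to a half-open interval (low, high].
# Features missing from the box are treated as unbounded (assumption).
Box = Dict[int, Tuple[float, float]]


@dataclass
class Leaf:
    value: float


@dataclass
class Split:
    feature: int
    threshold: float
    left: "Node"   # taken when x[feature] <= threshold
    right: "Node"  # taken when x[feature] > threshold


Node = Union[Leaf, Split]


def tree_classes(node: Node, box: Box) -> Iterator[Tuple[Box, float]]:
    """Yield (box, value) for every leaf reachable from inputs inside `box`."""
    if isinstance(node, Leaf):
        yield box, node.value
        return
    low, high = box.get(node.feature, (-inf, inf))
    if low < node.threshold:   # some x in (low, high] has x <= threshold
        left_box = {**box, node.feature: (low, min(high, node.threshold))}
        yield from tree_classes(node.left, left_box)
    if high > node.threshold:  # some x in (low, high] has x > threshold
        right_box = {**box, node.feature: (max(low, node.threshold), high)}
        yield from tree_classes(node.right, right_box)


def ensemble_classes(trees: List[Node], box: Box,
                     total: float = 0.0) -> Iterator[Tuple[Box, float]]:
    """Refine `box` tree by tree; the ensemble output is assumed to be a sum."""
    if not trees:
        yield box, total
        return
    for sub_box, value in tree_classes(trees[0], box):
        yield from ensemble_classes(trees[1:], sub_box, total + value)


def verify(trees: List[Node], domain: Box, low: float, high: float) -> bool:
    """Check that every equivalence class in `domain` has output in [low, high]."""
    return all(low <= out <= high for _, out in ensemble_classes(trees, domain))


# Hypothetical two-tree ensemble over two features.
tree_a = Split(0, 0.5, Leaf(0.0), Leaf(1.0))
tree_b = Split(0, 0.25, Leaf(0.25), Split(1, 0.0, Leaf(0.5), Leaf(0.75)))
domain = {0: (0.0, 1.0), 1: (-1.0, 1.0)}
requirement_holds = verify([tree_a, tree_b], domain, 0.0, 2.0)
```

The number of boxes can grow quickly with the number of trees and their depth, which is in line with the scalability limits the abstract reports.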
Online Algorithms for Multi-Level Aggregation
In the Multi-Level Aggregation Problem (MLAP), requests arrive at the nodes
of an edge-weighted tree T, and have to be served eventually. A service is
defined as a subtree X of T that contains its root. This subtree X serves all
requests that are pending in the nodes of X, and the cost of this service is
equal to the total weight of X. Each request also incurs waiting cost between
its arrival and service times. The objective is to minimize the total waiting
cost of all requests plus the total cost of all service subtrees. MLAP is a
generalization of some well-studied optimization problems; for example, for
trees of depth 1, MLAP is equivalent to the TCP Acknowledgment Problem, while
for trees of depth 2, it is equivalent to the Joint Replenishment Problem.
Aggregation problems for trees of arbitrary depth arise in multicasting, sensor
networks, communication in organization hierarchies, and in supply-chain
management. The instances of MLAP associated with these applications are
naturally online, in the sense that aggregation decisions need to be made
without information about future requests.
Constant-competitive online algorithms are known for MLAP with one or two
levels. However, it has been open whether there exist constant-competitive
online algorithms for trees of depth more than 2. Addressing this open problem,
we give the first constant-competitive online algorithm for networks with an
arbitrary (fixed) number of levels. The competitive ratio is O(D^4 2^D), where
D is the depth of T. The algorithm works for arbitrary waiting cost functions,
including the variant with deadlines.
We also show several additional lower and upper bound results for some
special cases of MLAP, including the Single-Phase variant and the case when the
tree is a path.
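The MLAP objective described above can be made concrete with a short sketch that evaluates the cost of a given schedule. It is not the paper's algorithm, only a cost calculator. The names `service_cost` and `schedule_cost`, the parent-map tree representation, and the linear waiting cost (service time minus arrival time) are assumptions for illustration; the abstract allows arbitrary waiting cost functions.

```python
from typing import Dict, Hashable, List, Set, Tuple


def service_cost(nodes: Set[Hashable], parent: Dict[Hashable, Hashable],
                 weight: Dict[Hashable, float], root: Hashable) -> float:
    """Total edge weight of a service subtree X that must contain the root.

    weight[v] is the weight of the edge from v to parent[v].
    """
    if root not in nodes:
        raise ValueError("a service subtree must contain the root")
    for v in nodes:
        if v != root and parent[v] not in nodes:
            raise ValueError("service nodes do not form a connected subtree")
    return sum(weight[v] for v in nodes if v != root)


def schedule_cost(requests: List[Tuple[Hashable, float]],
                  services: List[Tuple[float, Set[Hashable]]],
                  parent: Dict[Hashable, Hashable],
                  weight: Dict[Hashable, float],
                  root: Hashable) -> float:
    """Service cost plus waiting cost, assuming linear waiting cost."""
    ordered = sorted(services, key=lambda s: s[0])  # sort by service time
    total = sum(service_cost(nodes, parent, weight, root) for _, nodes in ordered)
    for node, arrival in requests:
        # A request is served by the first service at or after its arrival
        # whose subtree contains the request's node.
        served_at = next((t for t, nodes in ordered
                          if t >= arrival and node in nodes), None)
        if served_at is None:
            raise ValueError("every request has to be served eventually")
        total += served_at - arrival
    return total


# Hypothetical instance: root "r" with children "a" and "c"; "b" is a child of "a".
parent = {"a": "r", "b": "a", "c": "r"}
weight = {"a": 2.0, "b": 3.0, "c": 1.0}
requests = [("b", 0.0), ("c", 1.0), ("a", 4.0)]
services = [(2.0, {"r", "a", "b", "c"}), (5.0, {"r", "a"})]
total_cost = schedule_cost(requests, services, parent, weight, "r")
```

An online algorithm would have to choose the service times and subtrees without seeing future requests; this sketch only scores a schedule that has already been fixed.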
Planting trees in graphs, and finding them back
In this paper we study detection and reconstruction of planted structures in
Erdős-Rényi random graphs. Motivated by a problem of communication
security, we focus on planted structures that consist of a tree graph. For
planted line graphs, we establish the following phase diagram. In a low density
region where the average degree of the initial graph is below some
critical value , detection and reconstruction go from impossible
to easy as the line length crosses some critical value ,
where is the number of nodes in the graph. In the high density region
, detection goes from impossible to easy as goes from
to , and reconstruction remains impossible so
long as . For -ary trees of varying depth and ,
we identify a low-density region , such that the following
holds. There is a threshold with the following properties.
Detection goes from feasible to impossible as crosses . We also show
that only partial reconstruction is feasible at best for . We
conjecture a similar picture to hold for -ary trees as for lines in the
high-density region , but confirm only the following part of
this picture: Detection is easy for -ary trees of size ,
while at best only partial reconstruction is feasible for -ary trees of any
size . These results are in contrast with the corresponding picture for
detection and reconstruction of low rank planted structures, such as
dense subgraphs and block communities: We observe a discrepancy between
detection and reconstruction, the latter being impossible for a wide range of
parameters where detection is easy. This property does not hold for previously
studied low rank planted structures.
A Unified approach to concurrent and parallel algorithms on balanced data structures
Concurrent and parallel algorithms are different. However, in the case of dictionaries, both kinds of algorithms share many
common points. We present a unified approach emphasizing these points. It is based on a careful analysis of the sequential
algorithm, extracting from it the most basic facts, which are later encapsulated as local rules. We apply the method to the
insertion algorithms in AVL trees. All the concurrent and parallel insertion algorithms have two main phases: a
percolation phase, which moves the keys to be inserted down, and a rebalancing phase. Finally, some other algorithms and
balanced structures are discussed.
Postprint (published version)