Mining data streams using option trees (revised edition, 2004)
The data stream model for data mining places harsh restrictions on a learning algorithm. A model must be induced following the briefest interrogation of the data, must use only available memory and must update itself over time within these constraints. Additionally, the model must be able to be used for data mining at any point in time.
This paper describes a data stream classification algorithm using an ensemble of option trees. The ensemble of trees is induced by boosting and iteratively combined into a single interpretable model. The algorithm is evaluated using benchmark datasets for accuracy against state-of-the-art algorithms that make use of the entire dataset.
Offline algorithms for dynamic minimum spanning tree problems
We describe an efficient algorithm for maintaining a minimum spanning tree (MST) in a graph subject to a sequence of edge weight modifications. The sequence of minimum spanning trees is computed offline, after the sequence of modifications is known. The algorithm performs O(log n) work per modification, where n is the number of vertices in the graph. We use our techniques to solve the offline geometric MST problem for a planar point set subject to insertions and deletions; our algorithm for this problem performs O(log^2 n) work per modification. No previous dynamic geometric MST algorithm was known.
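To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the O(log n) algorithm from the abstract: it simply reruns Kruskal's algorithm after every edge weight modification. The function names `mst_weight` and `offline_mst_weights` and the edge-dictionary representation are assumptions made for this illustration, and the graph is assumed to stay connected.

```python
def mst_weight(n, edges):
    """Kruskal's algorithm: total weight of an MST of a connected graph.

    edges maps (u, v) -> weight, with vertices numbered 0..n-1.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for (u, v), w in sorted(edges.items(), key=lambda item: item[1]):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so keep it
            parent[ru] = rv
            total += w
    return total


def offline_mst_weights(n, edges, modifications):
    """Naive baseline: recompute the MST after each (edge, new_weight) change."""
    current = dict(edges)
    weights = []
    for edge, new_weight in modifications:
        current[edge] = new_weight
        weights.append(mst_weight(n, current))
    return weights


edges = {(0, 1): 1, (1, 2): 2, (2, 3): 3, (0, 3): 4, (0, 2): 5}
print(offline_mst_weights(4, edges, [((2, 3), 10), ((0, 2), 0)]))
```

Each modification here costs a full sort over the edges, which is the per-update cost the offline algorithm is designed to avoid.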
Persisting randomness in randomly growing discrete structures: graphs and search trees
The successive discrete structures generated by a sequential algorithm from
random input constitute a Markov chain that may exhibit long term dependence on
its first few input values. Using examples from random graph theory and search
algorithms we show how such persistence of randomness can be detected and
quantified with techniques from discrete potential theory. We also show that
this approach can be used to obtain strong limit theorems in cases where
previously only distributional convergence was known.
Finiteness theorems in stochastic integer programming
We study Graver test sets for families of linear multi-stage stochastic
integer programs with varying number of scenarios. We show that these test sets
can be decomposed into finitely many "building blocks", independent of the
number of scenarios, and we give an effective procedure to compute these
building blocks. The paper includes an introduction to Nash-Williams' theory of
better-quasi-orderings, which is used to show termination of our algorithm. We
also apply this theory to finiteness results for Hilbert functions.
On Verifying Complex Properties using Symbolic Shape Analysis
One of the main challenges in the verification of software systems is the
analysis of unbounded data structures with dynamic memory allocation, such as
linked data structures and arrays. We describe Bohne, a new analysis for
verifying data structures. Bohne verifies data structure operations and shows
that 1) the operations preserve data structure invariants and 2) the operations
satisfy their specifications expressed in terms of changes to the set of
objects stored in the data structure. During the analysis, Bohne infers loop
invariants in the form of disjunctions of universally quantified Boolean
combinations of formulas. To synthesize loop invariants of this form, Bohne
uses a combination of decision procedures for Monadic Second-Order Logic over
trees, SMT-LIB decision procedures (currently CVC Lite), and an automated
reasoner within the Isabelle interactive theorem prover. This architecture
shows that synthesized loop invariants can serve as a useful communication
mechanism between different decision procedures. Using Bohne, we have verified
operations on data structures such as linked lists with iterators and back
pointers, trees with and without parent pointers, two-level skip lists, array
data structures, and sorted lists. We have deployed Bohne in the Hob and Jahob
data structure analysis systems, enabling us to combine Bohne with analyses of
data structure clients and apply it in the context of larger programs. This
report describes the Bohne algorithm as well as techniques that Bohne uses to
reduce the amount of annotations and the running time of the analysis.
Algorithms for Combinatorial Systems: Well-Founded Systems and Newton Iterations
We consider systems of recursively defined combinatorial structures. We give
algorithms checking that these systems are well founded, computing generating
series and providing numerical values. Our framework is an articulation of the
constructible classes of Flajolet and Sedgewick with Joyal's species theory. We
extend the implicit species theorem to structures of size zero. A quadratic
iterative Newton method is shown to solve well-founded systems combinatorially.
From there, truncations of the corresponding generating series are obtained in
quasi-optimal complexity. This iteration transfers to a numerical scheme that
converges unconditionally to the values of the generating series inside their
disk of convergence. These results provide important subroutines in random
generation. Finally, the approach is extended to combinatorial differential
systems.
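As a rough illustration of how a Newton iteration can solve a combinatorial system on generating series, the sketch below uses the single equation Y = 1 + z*Y^2 (binary trees, counted by the Catalan numbers). The choice of equation, the function names, and the quadratic-time series arithmetic are all assumptions made for this example; it is not the species-level algorithm of the paper and does not reach the quasi-optimal complexity the abstract mentions.

```python
def mul(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out


def inverse(a, n):
    """Inverse of a power series with constant term 1, truncated to n terms."""
    assert a[0] == 1
    inv = [1] + [0] * (n - 1)
    for k in range(1, n):
        # Coefficient k of a * inv must vanish.
        inv[k] = -sum(a[j] * inv[k - j] for j in range(1, min(k, len(a) - 1) + 1))
    return inv


def newton_catalan(n):
    """First n coefficients of the solution of Y = 1 + z*Y^2 by Newton iteration.

    Update: Y <- Y + (Phi(Y) - Y) / (1 - dPhi/dY), with Phi(Y) = 1 + z*Y^2.
    """
    y = [0] * n
    correct = 0  # number of coefficients known to be exact so far
    while correct < n:
        y2 = mul(y, y, n)
        phi = [1] + y2[:n - 1]                       # 1 + z*Y^2
        residual = [p - c for p, c in zip(phi, y)]   # Phi(Y) - Y
        denom = [1] + [-2 * c for c in y[:n - 1]]    # 1 - 2*z*Y
        step = mul(residual, inverse(denom, n), n)
        y = [c + s for c, s in zip(y, step)]
        correct = 2 * correct + 1                    # 0, 1, 3, 7, 15, ...
    return y


print(newton_catalan(10))
```

Starting from Y = 0, the number of exact coefficients goes 1, 3, 7, 15, and so on, which is the quadratic convergence the abstract refers to: each step roughly doubles the length of the correct truncation.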