A Survey of Parallel Data Mining
With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackle the problem of scalability in data mining. Recently there has been considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This paper surveys parallel data mining with a broader perspective. More precisely, we discuss the parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms.
On Algorithms and Complexity for Sets with Cardinality Constraints
Typestate systems ensure many desirable properties of imperative programs,
including initialization of object fields and correct use of stateful library
interfaces. Abstract sets with cardinality constraints naturally generalize
typestate properties: relationships between the typestates of objects can be
expressed as subset and disjointness relations on sets, and elements of sets
can be represented as sets of cardinality one. Motivated by these applications,
this paper presents new algorithms and new complexity results for constraints
on sets and their cardinalities. We study several classes of constraints and
demonstrate a trade-off between their expressive power and their complexity.
Our first result concerns a quantifier-free fragment of Boolean Algebra with
Presburger Arithmetic. We give a nondeterministic polynomial-time algorithm for
reducing the satisfiability of sets with symbolic cardinalities to constraints
on constant cardinalities, and give a polynomial-space algorithm for the
resulting problem.
In a quest for more efficient fragments, we identify several subclasses of
sets with cardinality constraints whose satisfiability is NP-hard. Finally, we
identify a class of constraints that has polynomial-time satisfiability and
entailment problems and can serve as a foundation for efficient program
analysis. Comment: 20 pages. 12 figures
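To make the link between typestates and set constraints concrete, here is a minimal brute-force sketch. It is not the algorithm of the paper: it simply enumerates every assignment of subsets of a small, fixed universe, so it only decides satisfiability for that universe size. The names `satisfiable`, `example` and `contradictory`, and the `Open`/`Closed` typestates, are assumptions made for illustration.

```python
from itertools import product

def satisfiable(num_sets, universe_size, constraint):
    """Search all assignments of subsets of {0, ..., universe_size - 1}.

    Returns a satisfying tuple of frozensets, or None if there is none
    for this universe size (exponential time; illustration only).
    """
    # Enumerate every subset of the universe through its bitmask.
    subsets = [frozenset(e for e in range(universe_size) if mask >> e & 1)
               for mask in range(1 << universe_size)]
    for assignment in product(subsets, repeat=num_sets):
        if constraint(*assignment):
            return assignment
    return None

def example(x, open_, closed):
    # x is a single object (a set of cardinality one) in typestate Open,
    # and the typestates Open and Closed are disjoint.
    return len(x) == 1 and x <= open_ and not (open_ & closed)

def contradictory(x, open_, closed):
    # Additionally requiring x to be Closed contradicts disjointness.
    return example(x, open_, closed) and x <= closed
```

With a universe of two elements, `satisfiable(3, 2, example)` finds a model, while `satisfiable(3, 2, contradictory)` returns `None`.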
Probability around the Quantum Gravity. Part 1: Pure Planar Gravity
In this paper we study a stochastic dynamics that leaves the quantum gravity
equilibrium distribution invariant. We begin the theoretical study of this dynamics
(earlier it was used only for Monte Carlo simulation). The main new results concern
the existence and properties of local correlation functions in the
thermodynamic limit. The study of the dynamics constitutes the third part of a
series of papers in which a more general class of processes was studied (but it is
self-contained). Those processes have some universal significance in
probability: they cover most concrete processes and have many
examples in computer science and biology. At the same time the paper can serve
as an introduction to quantum gravity for a probabilist: we give a rigorous
exposition of quantum gravity in the planar pure gravity case. We mostly use
combinatorial techniques instead of the random matrix models that are more popular
in physics; the central point is the famous exponent. Comment: 40 pages, 11 figures
Abelian networks IV. Dynamics of nonhalting networks
An abelian network is a collection of communicating automata whose state
transitions and message passing each satisfy a local commutativity condition.
This paper is a continuation of the abelian networks series of Bond and Levine
(2016), for which we extend the theory of abelian networks that halt on all
inputs to networks that can run forever. A nonhalting abelian network can be
realized as a discrete dynamical system in many different ways, depending on
the update order. We show that certain features of the dynamics, such as
minimal period length, have intrinsic definitions that do not require
specifying an update order.
We give an intrinsic definition of the \emph{torsion group} of a finite
irreducible (halting or nonhalting) abelian network, and show that it coincides
with the critical group of Bond and Levine (2016) if the network is halting. We
show that the torsion group acts freely on the set of invertible recurrent
components of the trajectory digraph, and identify when this action is
transitive.
This perspective leads to new results even in the classical case of sinkless
rotor networks (deterministic analogues of random walks). In Holroyd et al.
(2008) it was shown that the recurrent configurations of a sinkless rotor
network with just one chip are precisely the unicycles (spanning subgraphs with
a unique oriented cycle, with the chip on the cycle). We generalize this result
to abelian mobile agent networks with any number of chips. We give formulas for
generating series such as
$$\sum_{n \geq 1} r_n z^n = \det\left(\frac{1}{1-z} D - A\right)$$
where $r_n$ is the number of recurrent chip-and-rotor configurations with
$n$ chips; $D$ is the diagonal matrix of outdegrees, and $A$ is the adjacency
matrix. A consequence is that the sequence $(r_n)_{n \geq 1}$ completely
determines the spectrum of the simple random walk on the network. Comment: 95 pages, 21 figures
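The one-chip case can be explored by brute force on a tiny graph. The sketch below assumes one common rotor-router convention (advance the rotor at the chip's vertex, then move the chip along the new rotor direction); the function names and the triangle example are illustrative and not taken from the paper.

```python
from itertools import product

def rotor_step(graph, chip, rotors):
    """Advance the rotor at the chip's vertex, then move the chip along it."""
    rotors = list(rotors)
    rotors[chip] = (rotors[chip] + 1) % len(graph[chip])
    return graph[chip][rotors[chip]], tuple(rotors)

def recurrent_states(graph):
    """Return the (chip, rotors) states that the dynamics revisits forever."""
    n = len(graph)
    states = {(chip, rotors)
              for chip in range(n)
              for rotors in product(*(range(len(graph[v])) for v in range(n)))}
    # Repeatedly apply the step map; the image shrinks until only
    # periodic (recurrent) states remain.
    while True:
        image = {rotor_step(graph, chip, rotors) for chip, rotors in states}
        if image == states:
            return states
        states = image

# Hypothetical example: the bidirected triangle on vertices 0, 1, 2.
# graph[v] lists the out-neighbours of v in rotor order.
triangle = [[1, 2], [2, 0], [0, 1]]
```

For this triangle, `len(recurrent_states(triangle))` is 18 out of 24 states, which matches a hand count of the unicycles with the chip on the cycle: 2 oriented 3-cycles with 3 chip positions each, plus 6 rotor configurations containing a 2-cycle with 2 chip positions each.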
Random Matrices with Slow Correlation Decay
We consider large random matrices with a general slowly decaying correlation
among their entries. We prove universality of the local eigenvalue statistics and
optimal local laws for the resolvent away from the spectral edges, generalizing
the recent result of [arXiv:1604.08188] to allow slow correlation decay and
arbitrary expectation. The main novel tool is a systematic diagrammatic control
of a multivariate cumulant expansion. Comment: 41 pages, 1 figure. We corrected a typo in (4.1b)