The Complexity of Identifying Large Equivalence Classes
We prove that at least $\frac{3k-4}{k(2k-3)} \cdot \frac{n(n-1)}{2} - O(k)$ equivalence tests and no more than $\frac{2}{k} \cdot \frac{n(n-1)}{2} + O(n)$ equivalence tests are needed in the worst case to identify the equivalence classes with at least $k$ members in a set of $n$ elements. The upper bound is an improvement by a factor of 2 compared to known results. For $k = 3$ we give tighter bounds. Finally, for $k > n/2$ we prove that it is necessary and sufficient to make $2n - k - 1$ equivalence tests, which generalizes a known result for $k = \lceil (n+1)/2 \rceil$.
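To make the counting model concrete, here is a minimal baseline sketch, not the algorithm from the paper. It assumes a hypothetical oracle `same(a, b)` and compares each element against one representative of every class found so far, so it never uses more than n(n-1)/2 tests. The function name `large_classes` and the sample labels are illustrative only.

```python
from typing import Callable, List, Tuple


def large_classes(n: int, same: Callable[[int, int], bool], k: int) -> Tuple[List[List[int]], int]:
    """Group elements 0..n-1 using only pairwise equivalence tests.

    Returns the classes with at least k members and the number of tests used.
    Naive baseline: each element is tested against one representative per class.
    """
    classes: List[List[int]] = []  # classes[i][0] acts as the representative
    tests = 0
    for x in range(n):
        for cls in classes:
            tests += 1
            if same(cls[0], x):
                cls.append(x)
                break
        else:
            # x matched no known representative, so it starts a new class
            classes.append([x])
    return [c for c in classes if len(c) >= k], tests


# Hypothetical hidden labels; the oracle only reveals whether two labels agree.
labels = [0, 1, 0, 2, 1, 0]
big, tests = large_classes(len(labels), lambda a, b: labels[a] == labels[b], k=2)
```

On this input the sketch reports the two classes of size at least 2 and stays below the trivial bound of 15 tests for n = 6.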
Efficient Discovery of Ontology Functional Dependencies
Poor data quality has become a pervasive issue due to the increasing
complexity and size of modern datasets. Constraint based data cleaning
techniques rely on integrity constraints as a benchmark to identify and correct
errors. Data values that do not satisfy the given set of constraints are
flagged as dirty, and data updates are made to re-align the data and the
constraints. However, many errors often require user input to resolve due to
domain expertise defining specific terminology and relationships. For example,
in pharmaceuticals, 'Advil' is-a brand name for 'ibuprofen' that can be
captured in a pharmaceutical ontology. While functional dependencies (FDs) have
traditionally been used in existing data cleaning solutions to model syntactic
equivalence, they are not able to model broader relationships (e.g., is-a)
defined by an ontology. In this paper, we take a first step towards extending
the set of data quality constraints used in data cleaning by defining and
discovering Ontology Functional Dependencies (OFDs). We lay out
theoretical and practical foundations for OFDs, including a set of sound and
complete axioms, and a linear inference procedure. We then develop effective
algorithms for discovering OFDs, and a set of optimizations that efficiently
prune the search space. Our experimental evaluation using real data shows the
scalability and accuracy of our algorithms.
Comment: 12 pages
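The contrast between a plain FD and an ontology-aware dependency can be illustrated with a simplified sketch. It assumes the ontology is given as a hypothetical mapping `concepts` from each value to the set of concepts it may denote, and treats a dependency as satisfied when all right-hand-side values within a left-hand-side group share at least one concept. This is an illustrative reading, not the paper's formal definition or its discovery algorithm; the column names and rows are invented.

```python
from collections import defaultdict


def holds_fd(rows, lhs, rhs):
    """Plain FD check: equal lhs values must have syntactically equal rhs values."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def holds_ofd(rows, lhs, rhs, concepts):
    """Simplified ontology-aware check (assumption): within each lhs group,
    all rhs values must share at least one concept in `concepts`.
    Values missing from the mapping are treated as their own concept."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[lhs]].append(row[rhs])
    for values in groups.values():
        common = set.intersection(*(concepts.get(v, {v}) for v in values))
        if not common:
            return False
    return True


# Hypothetical ontology fragment and data
concepts = {"Advil": {"ibuprofen"}, "ibuprofen": {"ibuprofen"}, "Tylenol": {"acetaminophen"}}
rows = [
    {"symptom": "headache", "drug": "Advil"},
    {"symptom": "headache", "drug": "ibuprofen"},
    {"symptom": "fever", "drug": "Tylenol"},
]
fd_ok = holds_fd(rows, "symptom", "drug")
ofd_ok = holds_ofd(rows, "symptom", "drug", concepts)
```

Here the plain FD flags the two headache rows as conflicting, while the ontology-aware check accepts them because both values denote the same concept.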
Enumeration of non-orientable 3-manifolds using face pairing graphs and union-find
Drawing together techniques from combinatorics and computer science, we
improve the census algorithm for enumerating closed minimal P^2-irreducible
3-manifold triangulations. In particular, new constraints are proven for face
pairing graphs, and pruning techniques are improved using a modification of the
union-find algorithm. Using these results we catalogue all 136 closed
non-orientable P^2-irreducible 3-manifolds that can be formed from at most ten
tetrahedra.
Comment: 37 pages, 34 figures
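For readers unfamiliar with union-find, a minimal sketch of the standard data structure follows, with path compression and union by size. This is the textbook version, not the modified variant the abstract refers to; the class and method names are illustrative.

```python
class UnionFind:
    """Standard disjoint-set forest with path compression and union by size."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))  # every element starts as its own root
        self.size = [1] * n

    def find(self, x: int) -> int:
        # First pass: locate the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a: int, b: int) -> bool:
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger one
        self.size[ra] += self.size[rb]
        return True


uf = UnionFind(5)
merged = [uf.union(0, 1), uf.union(1, 2), uf.union(0, 2)]
```

The third `union` call returns `False` because elements 0 and 2 are already in the same set.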
The geometry of quantum learning
Concept learning provides a natural framework in which to place the problems
solved by the quantum algorithms of Bernstein-Vazirani and Grover. By combining
the tools used in these algorithms--quantum fast transforms and amplitude
amplification--with a novel (in this context) tool--a solution method for
geometrical optimization problems--we derive a general technique for quantum
concept learning. We name this technique "Amplified Impatient Learning" and
apply it to construct quantum algorithms solving two new problems: BATTLESHIP
and MAJORITY, more efficiently than is possible classically.
Comment: 20 pages, plain TeX with amssym.tex, related work at
http://www.math.uga.edu/~hunziker/ and http://math.ucsd.edu/~dmeyer
Local Causal States and Discrete Coherent Structures
Coherent structures form spontaneously in nonlinear spatiotemporal systems
and are found at all spatial scales in natural phenomena from laboratory
hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary
climate dynamics. Phenomenologically, they appear as key components that
organize the macroscopic behaviors in such systems. Despite a century of
effort, they have eluded rigorous analysis and empirical prediction, with
progress being made only recently. As a step in this direction, we present a formal
theory of coherent structures in fully-discrete dynamical field theories. It
builds on the notion of structure introduced by computational mechanics,
generalizing it to a local spatiotemporal setting. The analysis's main tool is
the local causal states, which are used to uncover a system's hidden
spatiotemporal symmetries and which identify coherent structures as
spatially-localized deviations from those symmetries. The approach is
behavior-driven in the sense that it does not rely on directly analyzing
spatiotemporal equations of motion; rather, it considers only the spatiotemporal
fields a system generates. As such, it offers an unsupervised approach to
discover and describe coherent structures. We illustrate the approach by
analyzing coherent structures generated by elementary cellular automata,
comparing the results with an earlier, dynamic-invariant-set approach that
decomposes fields into domains, particles, and particle interactions.
Comment: 27 pages, 10 figures;
http://csc.ucdavis.edu/~cmg/compmech/pubs/dcs.ht
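As a small illustration of the kind of spatiotemporal field the approach takes as input, the sketch below generates the field of an elementary cellular automaton. It assumes periodic boundary conditions and the usual Wolfram rule numbering, and it does not compute local causal states. The function names and the choice of rule 90 are illustrative only.

```python
from typing import List


def eca_step(cells: List[int], rule: int) -> List[int]:
    """Apply one step of an elementary cellular automaton (periodic boundaries).

    The neighborhood (left, center, right) is read as a 3-bit number that
    selects the corresponding bit of the Wolfram rule number.
    """
    n = len(cells)
    return [
        (rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def eca_field(initial: List[int], rule: int, steps: int) -> List[List[int]]:
    """Return the spacetime field: the initial row followed by `steps` updates."""
    field = [list(initial)]
    for _ in range(steps):
        field.append(eca_step(field[-1], rule))
    return field


# Rule 90 from a single seeded cell (illustrative choice of rule and width)
field = eca_field([0, 0, 0, 1, 0, 0, 0], rule=90, steps=2)
```

Each row of `field` is one time step, so the list of rows is the two-dimensional field that a behavior-driven analysis would take as its only input.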