A Comparison of Blocking Methods for Record Linkage
Record linkage seeks to merge databases and to remove duplicates when unique
identifiers are not available. Most approaches use blocking techniques to
reduce the computational complexity associated with record linkage. We review
traditional blocking techniques, which typically partition the records
according to a set of field attributes, and consider two variants of a method
known as locality sensitive hashing, sometimes referred to as "private
blocking." We compare these approaches in terms of their recall, reduction
ratio, and computational complexity. We evaluate these methods using different
synthetic datafiles and conclude with a discussion of privacy-related issues.
Comment: 22 pages, 2 tables, 7 figures
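As a rough illustration of the traditional blocking and the two quality measures named in this abstract, the sketch below partitions a toy set of records by a field-derived key and scores the resulting candidate pairs. The records, the blocking keys, and the function names are hypothetical, and the formulas are the commonly used definitions (recall as the share of true matching pairs kept, reduction ratio as one minus the share of all pairs kept), which the abstract itself does not spell out. Locality sensitive hashing is not shown.

```python
from itertools import combinations

def block_pairs(records, key):
    """Group record ids by a blocking key and return the within-block pairs."""
    blocks = {}
    for rid, rec in records.items():
        blocks.setdefault(key(rec), []).append(rid)
    pairs = set()
    for ids in blocks.values():
        pairs.update(frozenset(p) for p in combinations(ids, 2))
    return pairs

def recall(candidates, true_matches):
    """Fraction of true matching pairs that survive blocking."""
    return len(candidates & true_matches) / len(true_matches)

def reduction_ratio(candidates, n_records):
    """One minus the fraction of all possible pairs that remain to be compared."""
    total = n_records * (n_records - 1) // 2
    return 1 - len(candidates) / total

# Hypothetical toy data: records 1, 2 and 3 refer to the same person.
records = {
    1: {"last": "Smith"},
    2: {"last": "Smith"},
    3: {"last": "Smyth"},
    4: {"last": "Jones"},
    5: {"last": "Jonas"},
}
true_matches = {frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})}

exact = block_pairs(records, lambda r: r["last"])        # strict key
prefix = block_pairs(records, lambda r: r["last"][:2])   # looser key

for name, cand in [("exact", exact), ("prefix", prefix)]:
    print(name, recall(cand, true_matches), reduction_ratio(cand, len(records)))
```

On this toy data the stricter key compares fewer pairs but misses the misspelled record, while the looser key recovers every true match at the cost of more comparisons.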
Characterization of equivariant maps and application to entanglement detection
We study equivariant linear maps between finite-dimensional matrix algebras,
as introduced by Bhat. These maps satisfy an algebraic property which makes it
easy to study their positivity or k-positivity. They are therefore particularly
suitable for applications to entanglement detection in quantum information
theory. We characterize their Choi matrices. In particular, we focus on a
subfamily that we call (a, b)-unitarily equivariant. They can be seen as both a
generalization of maps invariant under unitary conjugation as studied by Bhat
and as a generalization of the equivariant maps studied by Collins et al. Using
representation theory, we fully compute them and study their graphical
representation, and show that they are basically enough to study all
equivariant maps. We finally apply them to the problem of entanglement
detection and prove that they form a sufficient (infinite) family of positive
maps to detect all k-entangled density matrices.
Comment: 16 pages, 4 figures
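To make the role of Choi matrices concrete, the sketch below builds the Choi matrix C = sum_{i,j} E_ij (x) Phi(E_ij) of a linear map Phi on n x n matrices and evaluates a quadratic form on it. By Choi's theorem, Phi is completely positive exactly when C is positive semidefinite. The example map is the plain transpose, a standard positive but not completely positive map, chosen only for illustration; it is not the (a, b)-unitarily equivariant family studied in the paper. Function names are hypothetical and entries are assumed real.

```python
def unit(n, i, j):
    """Matrix unit E_ij of size n x n."""
    return [[1.0 if (r, c) == (i, j) else 0.0 for c in range(n)] for r in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def choi(phi, n):
    """Choi matrix of phi as an n^2 x n^2 nested list: block (i, j) is phi(E_ij)."""
    size = n * n
    C = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            image = phi(unit(n, i, j))
            for r in range(n):
                for c in range(n):
                    C[i * n + r][j * n + c] += image[r][c]
    return C

def quadratic_form(C, v):
    """Compute v^T C v for a real vector v."""
    size = len(v)
    return sum(v[r] * C[r][c] * v[c] for r in range(size) for c in range(size))

C_transpose = choi(transpose, 2)
C_identity = choi(lambda m: m, 2)

# The antisymmetric vector exposes a negative direction of the transpose's Choi matrix.
v = [0.0, 1.0, -1.0, 0.0]
print(quadratic_form(C_transpose, v))
print(quadratic_form(C_identity, v))
```

A negative value of the quadratic form shows that the Choi matrix of the transpose is not positive semidefinite, which is why the transpose can witness entanglement even though it is a positive map.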
Efficient dielectric matrix calculations using the Lanczos algorithm for fast many-body implementations
We present an implementation that addresses the two major bottlenecks
of traditional plane-waves implementations, the summations over conduction
states and the inversion of the dielectric matrix, without introducing new
approximations in the formalism. The first bottleneck is circumvented by
converting the summations into Sternheimer equations. Then, the novel avenue of
expressing the dielectric matrix in a Lanczos basis is developed, which reduces
the matrix size by orders of magnitude while being computationally efficient.
We also develop a model dielectric operator that allows us to further reduce
the size of the dielectric matrix without accuracy loss. Furthermore, we
develop a scheme that reduces the numerical cost of the contour deformation
technique to the level of the lightest plasmon pole model. Finally, the use of
the simplified quasi-minimal residual scheme in place of the conjugate
gradients algorithm allows a direct evaluation of the corrections at
the desired real frequencies, without the need for analytical continuation. The
performance of the resulting implementation is demonstrated by
comparison with a traditional plane-waves implementation, which reveals a
500-fold speedup for the silane molecule. Finally, the accuracy of our
implementation is demonstrated by comparison with other calculations
and experimental results.
Comment: 19 pages, 2 figures
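For readers unfamiliar with the Lanczos basis mentioned in this abstract, the following is a minimal textbook sketch of Lanczos tridiagonalization for a real symmetric matrix. It is not the paper's implementation: the dense nested-list matrix, the function names, and the absence of reorthogonalization are simplifying assumptions. After a chosen number of steps, the diagonal entries (alphas) and off-diagonal entries (betas) define a small tridiagonal matrix that represents the operator in the Krylov subspace generated from the starting vector.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def lanczos(A, v0, steps, tol=1e-12):
    """Return (alphas, betas) of the tridiagonal matrix after at most `steps` steps."""
    norm = math.sqrt(dot(v0, v0))
    q = [x / norm for x in v0]          # current Lanczos vector
    q_prev = [0.0] * len(v0)            # previous Lanczos vector
    beta = 0.0
    alphas, betas = [], []
    for j in range(steps):
        w = matvec(A, q)
        w = [wi - beta * pi for wi, pi in zip(w, q_prev)]
        alpha = dot(w, q)
        w = [wi - alpha * qi for wi, qi in zip(w, q)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        # Stop at the last requested step or when the Krylov space is exhausted.
        if j == steps - 1 or beta < tol:
            break
        betas.append(beta)
        q_prev, q = q, [wi / beta for wi in w]
    return alphas, betas

# Hypothetical 3 x 3 symmetric test matrix.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
alphas, betas = lanczos(A, [1.0, 0.0, 0.0], steps=3)
print(alphas, betas)
```

Running fewer steps than the matrix dimension yields a smaller tridiagonal matrix, which is the sense in which a Lanczos basis can shrink the problem; production codes typically add reorthogonalization to control loss of orthogonality in floating point.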
Current Algorithms for Detecting Subgraphs of Bounded Treewidth Are Probably Optimal
The Subgraph Isomorphism problem is of considerable importance in computer science. We examine the problem when the pattern graph H is of bounded treewidth, as occurs in a variety of applications. This problem has a well-known algorithm via color-coding that runs in time O(n^{tw(H)+1}) [Alon, Yuster, Zwick '95], where n is the number of vertices of the host graph G. While there are pattern graphs known for which Subgraph Isomorphism can be solved in an improved running time of O(n^{tw(H)+1-ε}) or even faster (e.g. for k-cliques), it is not known whether such improvements are possible for all patterns. The only known lower bound rules out time n^{o(tw(H) / log(tw(H)))} for any class of patterns of unbounded treewidth assuming the Exponential Time Hypothesis [Marx '07].
In this paper, we demonstrate the existence of maximally hard pattern graphs H that require time n^{tw(H)+1-o(1)}. Specifically, under the Strong Exponential Time Hypothesis (SETH), a standard assumption from fine-grained complexity theory, we prove the following asymptotic statement for large treewidth t:
For any ε > 0 there exists t ≥ 3 and a pattern graph H of treewidth t such that Subgraph Isomorphism on pattern H has no algorithm running in time O(n^{t+1-ε}).
Under the more recent 3-uniform Hyperclique hypothesis, we even obtain tight lower bounds for each specific treewidth t ≥ 3:
For any t ≥ 3 there exists a pattern graph H of treewidth t such that for any ε > 0 Subgraph Isomorphism on pattern H has no algorithm running in time O(n^{t+1-ε}).
In addition to these main results, we explore (1) colored and uncolored problem variants (and why they are equivalent for most cases), (2) Subgraph Isomorphism for tw < 3, (3) Subgraph Isomorphism parameterized by pathwidth instead of treewidth, and (4) a weighted variant that we call Exact Weight Subgraph Isomorphism, for which we examine pseudo-polynomial time algorithms. For many of these settings we obtain similarly tight upper and lower bounds.
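The color-coding idea referred to in this abstract can be sketched in its simplest setting, where the pattern is a path on k vertices (treewidth 1). This is an illustrative assumption, not the general bounded-treewidth algorithm. Each trial colors the host graph's vertices at random with k colors and then uses dynamic programming over color sets to look for a path whose vertices all have distinct colors; such a path is automatically simple. The function names, trial count, and seed are hypothetical choices. The method can miss a path with small probability but never reports one that does not exist.

```python
import random

def has_colorful_path(adj, colors, k):
    """Check for a path on k vertices whose colors are pairwise distinct."""
    # reachable[v] holds bitmasks of color sets of colorful paths ending at v.
    reachable = {v: {1 << colors[v]} for v in adj}
    for _ in range(k - 1):
        nxt = {v: set() for v in adj}
        for u in adj:
            for mask in reachable[u]:
                for v in adj[u]:
                    bit = 1 << colors[v]
                    if not mask & bit:      # extend only with an unused color
                        nxt[v].add(mask | bit)
        reachable = nxt
    return any(reachable[v] for v in adj)

def has_path_color_coding(adj, k, trials=200, seed=0):
    """Randomized test for a simple path on k vertices (one-sided error)."""
    rng = random.Random(seed)
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        if has_colorful_path(adj, colors, k):
            return True
    return False

# Hypothetical host graphs as adjacency lists.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(has_path_color_coding(path4, 3))
print(has_path_color_coding(star, 4))
```

The dynamic program touches each edge once per color set, so a single trial costs time exponential in k but linear in the size of the host graph; repeating trials drives down the chance of missing an existing path.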
Resolving the Structure of Black Holes: Philosophizing with a Hammer
We give a broad conceptual review of what we have learned about black holes
and their microstate structure from the study of microstate geometries and
their string theory limits. We draw upon general relativity, supergravity,
string theory and holographic field theory to extract universal ideas and
structural features that we expect to be important in resolving the information
problem and understanding the microstate structure of Schwarzschild and Kerr
black holes. In particular, we emphasize two conceptually and physically
distinct ideas, with different underlying energy scales: a) the transition that
supports the microstate structure and prevents the formation of a horizon and
b) the representation of the detailed microstate structure itself in terms of
fluctuations around the transitioned state. We also show that the supergravity
mechanism that supports microstate geometries becomes, in the string theory
limit, either brane polarization or the excitation of non-Abelian degrees of
freedom. We thus argue that if any mechanism for supporting structure at the
horizon scale is to be given substance within string theory then it must be
some manifestation of microstate geometries.
Comment: 32 pages + references