309 research outputs found

    A Comparison of Blocking Methods for Record Linkage

    Record linkage seeks to merge databases and to remove duplicates when unique identifiers are not available. Most approaches use blocking techniques to reduce the computational complexity associated with record linkage. We review traditional blocking techniques, which typically partition the records according to a set of field attributes, and consider two variants of a method known as locality sensitive hashing, sometimes referred to as "private blocking." We compare these approaches in terms of their recall, reduction ratio, and computational complexity. We evaluate these methods using different synthetic datafiles and conclude with a discussion of privacy-related issues. Comment: 22 pages, 2 tables, 7 figures
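
    As a rough illustration of the traditional blocking and the two quality measures named in this abstract, the sketch below partitions a toy dataset by a field-based key and computes recall and reduction ratio. The records, the blocking key (first letter of the surname plus birth year), and all function names are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations

def block_by_key(records, key_func):
    """Partition record ids into blocks that share the same blocking key."""
    blocks = {}
    for rec_id, rec in records.items():
        blocks.setdefault(key_func(rec), []).append(rec_id)
    return blocks

def candidate_pairs(blocks):
    """Only records in the same block are compared with each other."""
    pairs = set()
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs

def evaluate(records, true_matches, pairs):
    """Recall: share of true matches kept. Reduction ratio: share of comparisons avoided."""
    n = len(records)
    total_pairs = n * (n - 1) // 2
    reduction_ratio = 1 - len(pairs) / total_pairs
    recall = len(pairs & true_matches) / len(true_matches)
    return recall, reduction_ratio

# Hypothetical toy data: pairs (1, 2), (3, 4) and (5, 6) refer to the same person.
records = {
    1: {"first": "Anna", "last": "Smith", "year": 1980},
    2: {"first": "Ana", "last": "Smith", "year": 1980},
    3: {"first": "Bob", "last": "Jones", "year": 1975},
    4: {"first": "Robert", "last": "Jones", "year": 1975},
    5: {"first": "Carla", "last": "Smyth", "year": 1980},
    6: {"first": "Carla", "last": "Smith", "year": 1980},
}
true_matches = {(1, 2), (3, 4), (5, 6)}

# Assumed blocking key: first letter of the surname plus birth year.
blocks = block_by_key(records, lambda r: (r["last"][0], r["year"]))
pairs = candidate_pairs(blocks)
recall, rr = evaluate(records, true_matches, pairs)
print(f"candidate pairs: {len(pairs)}, recall: {recall:.2f}, reduction ratio: {rr:.2f}")
```

    A coarser key tends to raise recall at the cost of a lower reduction ratio, which is the trade-off the comparison in the abstract is concerned with.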

    Characterization of equivariant maps and application to entanglement detection

    We study equivariant linear maps between finite-dimensional matrix algebras, as introduced by Bhat. These maps satisfy an algebraic property which makes it easy to study their positivity or k-positivity. They are therefore particularly suitable for applications to entanglement detection in quantum information theory. We characterize their Choi matrices. In particular, we focus on a subfamily that we call (a, b)-unitarily equivariant. They can be seen as both a generalization of maps invariant under unitary conjugation as studied by Bhat and as a generalization of the equivariant maps studied by Collins et al. Using representation theory, we fully compute them and study their graphical representation, and show that they are basically enough to study all equivariant maps. We finally apply them to the problem of entanglement detection and prove that they form a sufficient (infinite) family of positive maps to detect all k-entangled density matrices. Comment: 16 pages, 4 figures

    Efficient dielectric matrix calculations using the Lanczos algorithm for fast many-body $G_0W_0$ implementations

    We present a $G_0W_0$ implementation that assesses the two major bottlenecks of traditional plane-waves implementations, the summations over conduction states and the inversion of the dielectric matrix, without introducing new approximations in the formalism. The first bottleneck is circumvented by converting the summations into Sternheimer equations. Then, the novel avenue of expressing the dielectric matrix in a Lanczos basis is developed, which reduces the matrix size by orders of magnitude while being computationally efficient. We also develop a model dielectric operator that allows us to further reduce the size of the dielectric matrix without accuracy loss. Furthermore, we develop a scheme that reduces the numerical cost of the contour deformation technique to the level of the lightest plasmon pole model. Finally, the use of the simplified quasi-minimal residual scheme in replacement of the conjugate gradients algorithm allows a direct evaluation of the $G_0W_0$ corrections at the desired real frequencies, without need for analytical continuation. The performance of the resulting $G_0W_0$ implementation is demonstrated by comparison with a traditional plane-waves implementation, which reveals a 500-fold speedup for the silane molecule. Finally, the accuracy of our $G_0W_0$ implementation is demonstrated by comparison with other $G_0W_0$ calculations and experimental results. Comment: 19 pages, 2 figures

    Current Algorithms for Detecting Subgraphs of Bounded Treewidth are Probably Optimal

    The Subgraph Isomorphism problem is of considerable importance in computer science. We examine the problem when the pattern graph $H$ is of bounded treewidth, as occurs in a variety of applications. This problem has a well-known algorithm via color-coding that runs in time $O(n^{tw(H)+1})$ [Alon, Yuster, Zwick'95], where $n$ is the number of vertices of the host graph $G$. While there are pattern graphs known for which Subgraph Isomorphism can be solved in an improved running time of $O(n^{tw(H)+1-\varepsilon})$ or even faster (e.g. for $k$-cliques), it is not known whether such improvements are possible for all patterns. The only known lower bound rules out time $n^{o(tw(H) / \log(tw(H)))}$ for any class of patterns of unbounded treewidth assuming the Exponential Time Hypothesis [Marx'07]. In this paper, we demonstrate the existence of maximally hard pattern graphs $H$ that require time $n^{tw(H)+1-o(1)}$. Specifically, under the Strong Exponential Time Hypothesis (SETH), a standard assumption from fine-grained complexity theory, we prove the following asymptotic statement for large treewidth $t$: For any $\varepsilon > 0$ there exists $t \ge 3$ and a pattern graph $H$ of treewidth $t$ such that Subgraph Isomorphism on pattern $H$ has no algorithm running in time $O(n^{t+1-\varepsilon})$. Under the more recent 3-uniform Hyperclique hypothesis, we even obtain tight lower bounds for each specific treewidth $t \ge 3$: For any $t \ge 3$ there exists a pattern graph $H$ of treewidth $t$ such that for any $\varepsilon > 0$ Subgraph Isomorphism on pattern $H$ has no algorithm running in time $O(n^{t+1-\varepsilon})$. In addition to these main results, we explore (1) colored and uncolored problem variants (and why they are equivalent for most cases), (2) Subgraph Isomorphism for $tw < 3$, (3) Subgraph Isomorphism parameterized by pathwidth, and (4) a weighted problem variant

    Current Algorithms for Detecting Subgraphs of Bounded Treewidth Are Probably Optimal

    The Subgraph Isomorphism problem is of considerable importance in computer science. We examine the problem when the pattern graph $H$ is of bounded treewidth, as occurs in a variety of applications. This problem has a well-known algorithm via color-coding that runs in time $O(n^{tw(H)+1})$ [Alon, Yuster, Zwick'95], where $n$ is the number of vertices of the host graph $G$. While there are pattern graphs known for which Subgraph Isomorphism can be solved in an improved running time of $O(n^{tw(H)+1-\varepsilon})$ or even faster (e.g. for $k$-cliques), it is not known whether such improvements are possible for all patterns. The only known lower bound rules out time $n^{o(tw(H) / \log(tw(H)))}$ for any class of patterns of unbounded treewidth assuming the Exponential Time Hypothesis [Marx'07]. In this paper, we demonstrate the existence of maximally hard pattern graphs $H$ that require time $n^{tw(H)+1-o(1)}$. Specifically, under the Strong Exponential Time Hypothesis (SETH), a standard assumption from fine-grained complexity theory, we prove the following asymptotic statement for large treewidth $t$: For any $\varepsilon > 0$ there exists $t \ge 3$ and a pattern graph $H$ of treewidth $t$ such that Subgraph Isomorphism on pattern $H$ has no algorithm running in time $O(n^{t+1-\varepsilon})$. Under the more recent 3-uniform Hyperclique hypothesis, we even obtain tight lower bounds for each specific treewidth $t \ge 3$: For any $t \ge 3$ there exists a pattern graph $H$ of treewidth $t$ such that for any $\varepsilon > 0$ Subgraph Isomorphism on pattern $H$ has no algorithm running in time $O(n^{t+1-\varepsilon})$. In addition to these main results, we explore (1) colored and uncolored problem variants (and why they are equivalent for most cases), (2) Subgraph Isomorphism for $tw < 3$, (3) Subgraph Isomorphism parameterized by pathwidth instead of treewidth, and (4) a weighted variant that we call Exact Weight Subgraph Isomorphism, for which we examine pseudo-polynomial time algorithms. For many of these settings we obtain similarly tight upper and lower bounds.

    AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data


    Resolving the Structure of Black Holes: Philosophizing with a Hammer

    We give a broad conceptual review of what we have learned about black holes and their microstate structure from the study of microstate geometries and their string theory limits. We draw upon general relativity, supergravity, string theory and holographic field theory to extract universal ideas and structural features that we expect to be important in resolving the information problem and understanding the microstate structure of Schwarzschild and Kerr black holes. In particular, we emphasize two conceptually and physically distinct ideas, with different underlying energy scales: a) the transition that supports the microstate structure and prevents the formation of a horizon and b) the representation of the detailed microstate structure itself in terms of fluctuations around the transitioned state. We also show that the supergravity mechanism that supports microstate geometries becomes, in the string theory limit, either brane polarization or the excitation of non-Abelian degrees of freedom. We thus argue that if any mechanism for supporting structure at the horizon scale is to be given substance within string theory then it must be some manifestation of microstate geometries. Comment: 32 pages + references