Two-message quantum interactive proofs and the quantum separability problem
Suppose that a polynomial-time mixed-state quantum circuit, described as a
sequence of local unitary interactions followed by a partial trace, generates a
quantum state shared between two parties. One might then wonder, does this
quantum circuit produce a state that is separable or entangled? Here, we give
evidence that it is computationally hard to decide the answer to this question,
even if one has access to the power of quantum computation. We begin by
exhibiting a two-message quantum interactive proof system that can decide the
answer to a promise version of the question. We then prove that the promise
problem is hard for the class of promise problems with "quantum statistical
zero knowledge" (QSZK) proof systems by demonstrating a polynomial-time Karp
reduction from the QSZK-complete promise problem "quantum state
distinguishability" to our quantum separability problem. By exploiting Knill's
efficient encoding of a matrix description of a state into a description of a
circuit to generate the state, we can show that our promise problem is NP-hard
with respect to Cook reductions. Thus, the quantum separability problem (as
phrased above) constitutes the first nontrivial promise problem decidable by a
two-message quantum interactive proof system while being hard for both NP and
QSZK. We also consider a variant of the problem, in which a given
polynomial-time mixed-state quantum circuit accepts a quantum state as input,
and the question is to decide if there is an input to this circuit which makes
its output separable across some bipartite cut. We prove that this problem is a
complete promise problem for the class QIP of problems decidable by quantum
interactive proof systems. Finally, we show that a two-message quantum
interactive proof system can also decide a multipartite generalization of the
quantum separability problem.
Comment: 34 pages, 6 figures; v2: technical improvements and new result for
the multipartite quantum separability problem; v3: minor changes to address
referee comments, accepted for presentation at the 2013 IEEE Conference on
Computational Complexity; v4: changed problem names; v5: updated references
and added a paragraph to the conclusion to connect with prior work on
separability testin
Structural characterizations of the navigational expressiveness of relation algebras on a tree
Given a document D in the form of an unordered node-labeled tree, we study
the expressiveness on D of various basic fragments of XPath, the core
navigational language on XML documents. Working from the perspective of these
languages as fragments of Tarski's relation algebra, we give characterizations,
in terms of the structure of D, for when a binary relation on its nodes is
definable by an expression in these algebras. Since each pair of nodes in such
a relation represents a unique path in D, our results therefore capture the
sets of paths in D definable in each of the fragments. We refer to this
perspective on language semantics as the "global view." In contrast with this
global view, there is also a "local view" where one is interested in the nodes
to which one can navigate starting from a particular node in the document. In
this view, we characterize when a set of nodes in D can be defined as the
result of applying an expression to a given node of D. All these definability
results, both in the global and the local view, are obtained by using a robust
two-step methodology, which consists of first characterizing when two nodes
cannot be distinguished by an expression in the respective fragments of XPath,
and then bootstrapping these characterizations to the desired results.
Comment: 58 Page
The g-theorem and quantum information theory
We study boundary renormalization group flows between boundary conformal
field theories in 1+1 dimensions using methods of quantum information theory.
We define an entropic g-function for theories with impurities in terms of the
relative entanglement entropy, and we prove that this g-function decreases
along boundary renormalization group flows. This entropic g-theorem is valid
at zero temperature, and is independent of the g-theorem based on the
thermal partition function. We also discuss the mutual information in boundary
RG flows, and how it encodes the correlations between the impurity and bulk
degrees of freedom. Our results provide a quantum-information understanding of
(boundary) RG flow as an increase of distinguishability between the UV fixed point
and the theory along the RG flow.
Comment: 34 pages + appendices, 8 figures. v2. Improved and corrected version
of the proo
DOCUMENT CLASSIFICATION USING MACHINE LEARNING
To perform document classification algorithmically, documents need to be represented in a form that the machine learning classifier can process. This report discusses the different types of feature vectors through which a document can be represented and later classified. The project aims to compare the Binary, Count and TfIdf feature vectors and their impact on document classification. To test how well each of the three feature vectors performs, we used the 20-newsgroup dataset and converted the documents to each of the three feature vectors. For each feature vector representation, we trained a Naïve Bayes classifier and then tested the resulting classifier on test documents. In our results, we found that TfIdf performed 4% better than the Count vectorizer and 6% better than the Binary vectorizer if stop words are removed. If stop words are not removed, TfIdf performed 6% better than the Binary vectorizer and 11% better than the Count vectorizer. Also, the Count vectorizer performs 2% better than the Binary vectorizer if stop words are removed, but lags behind by 5% if stop words are not removed. Thus, we conclude that TfIdf should be the preferred vectorizer for document representation and classification.
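As an illustration of the three representations compared in this abstract, the following is a minimal sketch on a toy corpus. It is not the report's code: the function names, the toy documents, and the unsmoothed weighting idf(t) = log(N / df(t)) are all assumptions made for the example, and common library vectorizers use smoothed or normalized variants that yield different numbers.

```python
import math
from collections import Counter

def build_vocabulary(docs):
    """Sorted list of every distinct token in the tokenized corpus."""
    return sorted({tok for doc in docs for tok in doc})

def count_vector(doc, vocab):
    """Count representation: how many times each vocabulary term occurs."""
    counts = Counter(doc)
    return [counts[t] for t in vocab]

def binary_vector(doc, vocab):
    """Binary representation: 1 if the term occurs at all, else 0."""
    present = set(doc)
    return [1 if t in present else 0 for t in vocab]

def tfidf_vectors(docs, vocab):
    """TF-IDF representation with raw counts and idf = log(N / df).

    Assumption: this unsmoothed idf is chosen for clarity only.
    """
    n_docs = len(docs)
    doc_freq = {t: sum(1 for d in docs if t in d) for t in vocab}
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocab}
    return [
        [c * idf[t] for c, t in zip(count_vector(d, vocab), vocab)]
        for d in docs
    ]

# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
    "quantum cat".split(),
]
vocab = build_vocabulary(docs)
counts = count_vector(docs[0], vocab)
binary = binary_vector(docs[0], vocab)
tfidf = tfidf_vectors(docs, vocab)
```

In this sketch the Binary and Count vectors differ only where a term repeats within a document, while TF-IDF additionally down-weights terms that appear in many documents. Any of the three could then be passed to a Naïve Bayes classifier, as the abstract describes.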
Community detection and stochastic block models: recent developments
The stochastic block model (SBM) is a random graph model with planted
clusters. It is widely employed as a canonical model to study clustering and
community detection, and more generally provides fertile ground for studying the
statistical and computational tradeoffs that arise in network and data
sciences.
This note surveys the recent developments that establish the fundamental
limits for community detection in the SBM, both with respect to
information-theoretic and computational thresholds, and for various recovery
requirements such as exact, partial and weak recovery (a.k.a., detection). The
main results discussed are the phase transitions for exact recovery at the
Chernoff-Hellinger threshold, the phase transition for weak recovery at the
Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
recovery, the learning of the SBM parameters and the gap between
information-theoretic and computational thresholds.
The note also covers some of the algorithms developed in the quest of
achieving the limits, in particular two-round algorithms via graph-splitting,
semi-definite programming, linearized belief propagation, classical and
nonbacktracking spectral methods. A few open problems are also discussed.
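To make the "random graph model with planted clusters" concrete, here is a minimal sampling sketch. It assumes the simplest symmetric form of the model, in which two nodes are joined independently with probability p_in when they share a community and p_out otherwise. The function name `sample_sbm` and its parameters are hypothetical and are not taken from the survey.

```python
import random

def sample_sbm(sizes, p_in, p_out, seed=0):
    """Sample an undirected graph from a symmetric stochastic block model.

    sizes : list of community sizes (the planted clusters)
    p_in  : edge probability for two nodes in the same community
    p_out : edge probability for two nodes in different communities
    Returns (labels, edges), where labels[i] is the community of node i.
    """
    rng = random.Random(seed)
    # Planted partition: node indices are assigned to communities in order.
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges

# Two planted communities of 50 nodes each, denser inside than across.
labels, edges = sample_sbm([50, 50], p_in=0.3, p_out=0.05, seed=1)
```

Community detection then asks for `labels` to be recovered (exactly, partially, or better than chance) from `edges` alone. How far apart p_in and p_out must be for each kind of recovery is the subject of the thresholds surveyed above.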
Information Theoretic Methods For Biometrics, Clustering, And Stemmatology
This thesis consists of four parts, three of which study issues related to the theory and applications of biometric systems, and one of which focuses on clustering. We establish an information-theoretic framework and the fundamental trade-off between the utility and the security of biometric systems. The utility includes person identification and secret binding, while template protection, privacy, and secrecy leakage are the security issues addressed. A general model of biometric systems is proposed, in which secret binding and the use of passwords are incorporated. The system model captures major biometric system designs including biometric cryptosystems, cancelable biometrics, secret binding and secret generating systems, and salt biometric systems. In addition to attacks at the database, information leakage from communication links between sensor modules and databases is considered. A general information-theoretic rate outer bound is derived for characterizing and comparing the fundamental capacity, and the security risks and benefits, of different system designs. We establish connections between linear codes and biometric systems, so that one can directly use the vast literature on coding theory for various noise and source random processes to achieve good performance in biometric systems. We develop two biometrics based on laser Doppler vibrometry (LDV) signals and electrocardiogram (ECG) signals. In both cases, changes in the statistics of biometric traits of the same individual are the major challenge, which prevents many methods from producing satisfactory results. We propose a robust feature selection method that specifically accounts for changes in statistics. The method yields the best results in both LDV and ECG biometrics in terms of equal error rates in authentication scenarios. Finally, we address a different kind of problem of learning from data, called clustering.
Instead of having a set of training data with known true labels, as in identification problems, we study the problem of grouping data points without given labels, and its application to computational stemmatology. Since the problem itself has no true answer, it is in general ill-posed unless some regularization or norm is set to define the quality of a partition. We propose the use of the minimum description length (MDL) principle for graphical based clustering. In the MDL framework, each data partitioning is viewed as a description of the data points, and the description that minimizes the total number of bits needed to describe the data points and the model itself is considered the best model. We show that on synthesized data the MDL clustering works well and fits natural intuition of how data should be clustered. Furthermore, we developed a computational stemmatology method based on MDL, which achieves the best performance level on a large dataset.