Properties of Classical and Quantum Jensen-Shannon Divergence
Jensen-Shannon divergence (JD) is a symmetrized and smoothed version of the
Kullback divergence, the most important divergence measure of information
theory. Unlike the Kullback divergence, it determines a metric in a very
direct way; indeed, it is the square of a metric. We consider a family of
divergence measures (JD_alpha for alpha > 0), the Jensen divergences of order
alpha, which generalize JD as JD_1 = JD. Using a result of Schoenberg, we
prove that JD_alpha is the square of a metric when alpha lies in the interval
(0,2], and that the
resulting metric space of probability distributions can be isometrically
embedded in a real Hilbert space. Quantum Jensen-Shannon divergence (QJD) is a
symmetrized and smoothed version of quantum relative entropy and can be
extended to a family of quantum Jensen divergences of order alpha (QJD_alpha).
We strengthen results by Lamberti et al. by proving that for qubits and pure
states, QJD_alpha^{1/2} is a metric and the resulting metric space can be
isometrically embedded in a real Hilbert space when alpha lies in the
interval (0,2]. In analogy with
Burbea and Rao's generalization of JD, we also define general QJD by
associating a Jensen-type quantity to any weighted family of states.
Appropriate interpretations of the quantities introduced are discussed, and
bounds are derived in terms of the total variation and trace distance.
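The classical construction can be sketched numerically (a minimal illustration of the standard definitions, not code from the paper): JD(p, q) = H((p+q)/2) - (H(p)+H(q))/2, whose square root is a metric, so in particular it obeys the triangle inequality.

```python
# Minimal sketch of the classical Jensen-Shannon divergence and a
# numerical spot check that sqrt(JD) satisfies the triangle inequality.
import numpy as np

def entropy(p):
    """Shannon entropy in nats; 0*log(0) is treated as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def jsd(p, q):
    """Jensen-Shannon divergence of two probability vectors."""
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

# Spot check the triangle inequality for sqrt(JD) on random distributions
rng = np.random.default_rng(0)
for _ in range(1000):
    p, q, r = (rng.dirichlet(np.ones(4)) for _ in range(3))
    assert np.sqrt(jsd(p, r)) <= np.sqrt(jsd(p, q)) + np.sqrt(jsd(q, r)) + 1e-12
```

Disjoint point masses attain the maximum value log 2, and identical distributions give 0, consistent with JD being a bounded, symmetric divergence.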
Universal bounds for the Holevo quantity, coherent information and the Jensen-Shannon divergence
The Holevo quantity provides an upper bound for the mutual information
between the sender of a classical message encoded in quantum carriers and the
receiver. Applying the strong sub-additivity of entropy we prove that the
Holevo quantity associated with an initial state and a given quantum operation
represented in its Kraus form is not larger than the exchange entropy. This
implies upper bounds for the coherent information and for the quantum
Jensen--Shannon divergence. Restricting our attention to classical
information, we bound the transmission distance between any two probability
distributions by the entropic distance, which is a concave function of the
Hellinger distance.
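The central quantity can be sketched as follows (an illustration of the textbook definition, not the paper's proof): the Holevo quantity of an ensemble {p_i, rho_i} is chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i); for a two-state equal-weight ensemble it coincides with the quantum Jensen-Shannon divergence and is bounded by log 2.

```python
# Sketch of the Holevo quantity for a qubit ensemble, assuming only the
# standard definitions of von Neumann entropy and chi.
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log rho), in nats."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]          # drop zero eigenvalues (and round-off)
    return -np.sum(w * np.log(w))

def holevo(probs, states):
    """chi = S(average state) - average of the member entropies."""
    avg = sum(p * r for p, r in zip(probs, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(r) for p, r in zip(probs, states))

# Equal-weight ensemble of two pure qubit states |0> and |+>:
# here chi equals the quantum Jensen-Shannon divergence of the pair.
rho = np.array([[1.0, 0.0], [0.0, 0.0]])           # |0><0|
sigma = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])   # |+><+|
chi = holevo([0.5, 0.5], [rho, sigma])
assert 0.0 <= chi <= np.log(2) + 1e-12
```

For non-orthogonal states such as |0> and |+>, chi is strictly below log 2, reflecting their imperfect distinguishability.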
Wootters' distance revisited: a new distinguishability criterium
The notion of distinguishability between quantum states has been shown to be
fundamental in the framework of quantum information theory. In this paper we
present a new distinguishability criterion based on an information-theoretic
quantity: the Jensen-Shannon divergence (JSD). This quantity has several
interesting properties, both from a conceptual and from a formal point of
view. Before defining this distinguishability criterion, we review some of
the most frequently used distances defined over quantum mechanics' Hilbert
space. Our main claim here is that the JSD can be taken as a unifying
distance between quantum states.
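The distinguishability measure can be sketched numerically (a minimal illustration assuming the standard definition QJSD(rho, sigma) = S((rho+sigma)/2) - (S(rho)+S(sigma))/2; the helper names are ours): for pure qubit states, sqrt(QJSD) behaves as a distance, ranging from 0 for identical states to sqrt(log 2) for orthogonal ones.

```python
# Sketch of the quantum Jensen-Shannon divergence as a distinguishability
# measure between pure qubit states, with a spot check of the triangle
# inequality for its square root.
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy -Tr(rho log rho), in nats."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return -np.sum(w * np.log(w))

def qjsd(rho, sigma):
    """Quantum Jensen-Shannon divergence of two density matrices."""
    return vn_entropy(0.5 * (rho + sigma)) - 0.5 * (vn_entropy(rho) + vn_entropy(sigma))

def projector(psi):
    """Density matrix |psi><psi| of a (normalized) pure state vector."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def dist(rho, sigma):
    """sqrt(QJSD), clipped against tiny negative round-off."""
    return np.sqrt(max(qjsd(rho, sigma), 0.0))

# Spot check the triangle inequality on random pure qubit states
rng = np.random.default_rng(1)
for _ in range(200):
    a, b, c = (projector(rng.normal(size=2) + 1j * rng.normal(size=2))
               for _ in range(3))
    assert dist(a, c) <= dist(a, b) + dist(b, c) + 1e-9
```

Orthogonal pure states, e.g. |0> and |1>, saturate the bound QJSD = log 2, the maximally distinguishable case.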
Determining the number of clusters in statistical data
In this work, three heuristic algorithms for automatically determining the number of clusters in data samples were developed and studied experimentally. Using them can improve well-known clustering algorithms. A distinctive feature of the proposed algorithms is that there is no need to solve the clustering task repeatedly for different numbers of clusters, with subsequent analysis of the quality of the resulting cluster structure.
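The idea of estimating the number of clusters without repeated clustering runs can be illustrated with a simple gap-based heuristic (our own illustrative sketch for 1-D data, not one of the paper's three algorithms): a single pass over the sorted points finds the largest relative jump in the sorted gap sequence, and the gaps above that jump are taken as cluster separators.

```python
# Illustrative gap heuristic: estimate the number of 1-D clusters from one
# pass over sorted data, with no repeated clustering for candidate k values.
import numpy as np

def estimate_k_1d(x):
    """k = (number of 'separating' gaps) + 1, where separating gaps are
    those above the largest relative jump in the descending gap sequence."""
    x = np.sort(np.asarray(x, dtype=float))
    gaps = np.sort(np.diff(x))[::-1]                  # descending
    ratios = gaps[:-1] / np.maximum(gaps[1:], 1e-12)  # jump between ranks
    return int(np.argmax(ratios)) + 2

rng = np.random.default_rng(0)
# Three well-separated 1-D clusters
data = np.concatenate([rng.normal(0, 0.1, 50),
                       rng.normal(5, 0.1, 50),
                       rng.normal(10, 0.1, 50)])
assert estimate_k_1d(data) == 3
```

This runs in O(n log n) versus re-clustering for every candidate k, which is the cost the abstract's algorithms are designed to avoid; unlike them, this toy version handles only one dimension.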
Meta-analysis of RNA-seq expression data across species, tissues and studies.
Background: Differences in gene expression drive phenotypic differences between species, yet major organs and tissues generally have conserved gene expression programs. Several comparative transcriptomic studies have observed greater similarity in gene expression between homologous tissues from different vertebrate species than between diverse tissues of the same species. However, a recent study by Lin and colleagues reached the opposite conclusion. These studies differed in the species and tissues analyzed, and in technical details of library preparation, sequencing, read mapping, normalization, gene sets, and clustering methods.
Results: To better understand gene expression evolution, we reanalyzed data from four studies, including that of Lin, encompassing 6-13 tissues each from 11 vertebrate species, using standardized mapping, normalization, and clustering methods. An analysis of independent data showed that the set of tissues chosen by Lin et al. was more similar to each other than those analyzed by previous studies. Comparing expression in five common tissues from the four studies, we observed that samples clustered exclusively by tissue rather than by species or study, supporting conservation of organ physiology in mammals. Furthermore, inter-study distances between homologous tissues were generally less than intra-study distances among different tissues, enabling informative meta-analyses. Notably, when comparing expression divergence of tissues over time to expression variation across 51 human GTEx tissues, we could accurately predict the clustering of expression for arbitrary pairs of tissues and species.
Conclusions: These results provide a framework for the design of future evolutionary studies of gene expression and demonstrate the utility of comparing RNA-seq data across studies.
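The tissue-versus-species clustering pattern can be reproduced on synthetic data (an illustrative sketch with simulated expression values, not the studies' actual pipelines or data): when the tissue effect on expression is stronger than the species effect, correlation-based hierarchical clustering groups samples by tissue across species.

```python
# Synthetic demonstration: samples cluster by tissue when the shared tissue
# signal dominates the species signal, as in the meta-analysis result.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n_genes, n_tissues, n_species = 500, 3, 4
tissue_effect = rng.normal(0, 2.0, (n_tissues, n_genes))   # strong shared signal
species_effect = rng.normal(0, 0.5, (n_species, n_genes))  # weaker signal

samples, labels = [], []
for t in range(n_tissues):
    for s in range(n_species):
        samples.append(tissue_effect[t] + species_effect[s]
                       + rng.normal(0, 0.3, n_genes))      # measurement noise
        labels.append(t)
X = np.array(samples)

# 1 - Pearson correlation as the distance, average-linkage clustering
Z = linkage(pdist(X, metric="correlation"), method="average")
clusters = fcluster(Z, t=n_tissues, criterion="maxclust")

# Every sample of a given tissue lands in the same cluster
for t in range(n_tissues):
    members = clusters[np.array(labels) == t]
    assert len(set(members)) == 1
```

Shrinking the tissue effect below the species effect flips the grouping, which is one way to interpret the disagreement with Lin et al.: tissue sets whose expression programs are unusually similar weaken the tissue signal relative to the species signal.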
Metrics for more than two points at once
The conventional definition of a topological metric over a space specifies
properties that must be obeyed by any measure of "how separated" two points in
that space are. Here it is shown how to extend that definition, and in
particular the triangle inequality, to concern arbitrary numbers of points.
Such a measure of how separated the points within a collection are can be
bootstrapped, to measure "how separated" from each other are two (or more)
collections. The measure presented here also allows fractional membership of an
element in a collection. This means it directly concerns measures of "how
spread out" a probability distribution over a space is. When such a measure is
bootstrapped to compare two collections, it allows us to measure how separated
two probability distributions are, or, more generally, how separated a
distribution of distributions is.
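One concrete instance of a multi-point separation measure with fractional membership is the weighted generalized Jensen-Shannon divergence (our illustrative choice, not necessarily the paper's construction): for n distributions with weights w_i, it is H(sum_i w_i p_i) - sum_i w_i H(p_i), which is 0 exactly when all members coincide and grows as they spread apart.

```python
# Sketch of a "how spread out" measure for n distributions at once:
# the weighted generalized Jensen-Shannon divergence.
import numpy as np

def entropy(p):
    """Shannon entropy in nats; 0*log(0) treated as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def spread(dists, weights=None):
    """H(mixture) minus the weighted mean entropy of the members."""
    dists = np.asarray(dists, dtype=float)
    n = len(dists)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    mixture = w @ dists                      # weights act as fractional membership
    return entropy(mixture) - sum(wi * entropy(p) for wi, p in zip(w, dists))

# n identical distributions have zero spread
p = [0.2, 0.3, 0.5]
assert spread([p, p, p]) < 1e-12
# n mutually disjoint point masses achieve the maximum, log n
assert abs(spread(np.eye(4)) - np.log(4)) < 1e-12
```

The weights supply exactly the fractional-membership behavior described above, and applying the same measure to mixtures of mixtures gives one way to compare collections of collections.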