
    Relative Information Loss in the PCA

    In this work we analyze principal component analysis (PCA) as a deterministic input-output system. We show that the relative information loss induced by reducing the dimensionality of the data after performing the PCA is the same as in dimensionality reduction without PCA. Finally, we analyze the case where the PCA uses the sample covariance matrix to compute the rotation. If the rotation matrix is not available at the output, we show that an infinite amount of information is lost. The relative information loss is shown to decrease with increasing sample size. Comment: 9 pages, 4 figures; extended version of a paper accepted for publication
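    As a concrete illustration of the input-output system described in this abstract, here is a minimal numpy sketch (our own construction, not code from the paper): the rotation is estimated from the sample covariance matrix, and dimensionality reduction then truncates the rotated coordinates. If the rotation matrix W is discarded at the output, the rotation cannot be inverted, which is the scenario in which the abstract reports infinite information loss. All names are illustrative.

```python
import numpy as np

def pca_reduce(X, m):
    """Rotate data X (n samples x d features) into its principal axes,
    using the sample covariance matrix, and keep the first m components."""
    Xc = X - X.mean(axis=0)              # center the data
    C = np.cov(Xc, rowvar=False)         # sample covariance matrix
    eigvals, W = np.linalg.eigh(C)       # eigendecomposition (C is symmetric)
    order = np.argsort(eigvals)[::-1]    # sort axes by decreasing variance
    W = W[:, order]                      # rotation matrix
    Y = Xc @ W                           # rotated, full-dimensional output
    return Y[:, :m], W                   # reduced output and the rotation

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
Y_reduced, W = pca_reduce(X, m=2)
print(Y_reduced.shape)                   # (500, 2)
```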

    Greedy Algorithms for Optimal Distribution Approximation

    The approximation of a discrete probability distribution $\mathbf{t}$ by an $M$-type distribution $\mathbf{p}$ is considered. The approximation error is measured by the informational divergence $\mathbb{D}(\mathbf{t}\Vert\mathbf{p})$, which is an appropriate measure, e.g., in the context of data compression. Properties of the optimal approximation are derived and bounds on the approximation error are presented, which are asymptotically tight. It is shown that $M$-type approximations that minimize either $\mathbb{D}(\mathbf{t}\Vert\mathbf{p})$, or $\mathbb{D}(\mathbf{p}\Vert\mathbf{t})$, or the variational distance $\Vert\mathbf{p}-\mathbf{t}\Vert_1$ can all be found by using specific instances of the same general greedy algorithm. Comment: 5 pages
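    The abstract does not spell out the greedy algorithm, so the following Python sketch is one plausible instance for the case of minimizing $\mathbb{D}(\mathbf{t}\Vert\mathbf{p})$, under the standard reading that an $M$-type distribution has probabilities $c_i/M$ with nonnegative integers $c_i$ summing to $M$. Since $\mathbb{D}(\mathbf{t}\Vert\mathbf{p}) = \log M + \sum_i t_i \log t_i - \sum_i t_i \log c_i$, minimizing the divergence amounts to spending the $M$ mass units on whichever bin has the largest marginal gain; this incremental allocation is optimal for separable concave objectives of this kind. Variable names are illustrative.

```python
import numpy as np

def greedy_m_type(t, M):
    """Greedy M-type approximation of t minimizing D(t || p),
    where p_i = c_i / M with nonnegative integers c_i summing to M."""
    t = np.asarray(t, dtype=float)
    support = t > 0
    c = support.astype(int)            # every support point needs c_i >= 1
    assert c.sum() <= M, "M must be at least the support size"
    for _ in range(M - c.sum()):
        # marginal gain of one extra unit in bin i: t_i * log(1 + 1/c_i)
        gain = np.where(support, t * np.log1p(1.0 / np.maximum(c, 1)), -np.inf)
        c[np.argmax(gain)] += 1
    return c / M                       # the M-type approximation p

t = [0.55, 0.3, 0.1, 0.05]
p = greedy_m_type(t, M=8)
print(p)                               # [0.5  0.25 0.125 0.125]
```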

    Information-Preserving Markov Aggregation

    We present a sufficient condition for a non-injective function of a Markov chain to be a second-order Markov chain with the same entropy rate as the original chain. This permits an information-preserving state space reduction by merging states or, equivalently, lossless compression of a Markov source on a sample-by-sample basis. The cardinality of the reduced state space is bounded from below by the node degrees of the transition graph associated with the original Markov chain. We also present an algorithm listing all possible information-preserving state space reductions for a given transition graph. We illustrate our results by applying the algorithm to a bi-gram letter model of an English text. Comment: 7 pages, 3 figures, 2 tables
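    To make the closing example concrete, the sketch below (a minimal illustration under our own assumptions, not the paper's algorithm or its sufficient condition) estimates a bi-gram letter model, i.e. a first-order Markov chain over letters with transition probabilities from consecutive-letter counts, and reads off the node degrees of its transition graph, the quantities the abstract says lower-bound the reduced state space. The sample text is a placeholder.

```python
import numpy as np

def bigram_model(text):
    """Estimate a first-order Markov chain over letters from
    consecutive-letter counts in the given text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    states = sorted(set(letters))
    idx = {s: i for i, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for a, b in zip(letters, letters[1:]):
        counts[idx[a], idx[b]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    P = counts / np.maximum(rows, 1)   # rows without outgoing counts stay 0
    return states, P

states, P = bigram_model("the quick brown fox jumps over the lazy dog")
out_degree = (P > 0).sum(axis=1)       # out-degrees in the transition graph
print(dict(zip(states, out_degree.tolist())))
```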