Sign Stable Projections, Sign Cauchy Projections and Chi-Square Kernels

Author: Hopcroft John
Li Ping
Samorodnitsky Gennady
Publication date: 05/08/2013

The method of stable random projections is popular for efficiently computing the Lp distances in high dimension (where 0<p<=2), using small space. Because it adopts nonadaptive linear projections, this method is naturally suitable when the data are collected in a dynamic streaming fashion (i.e., turnstile data streams). In this paper, we propose to use only the signs of the projected data and analyze the probability of collision (i.e., when the two signs differ). We derive a bound of the collision probability which is exact when p=2 and becomes less sharp when p moves away from 2. Interestingly, when p=1 (i.e., Cauchy random projections), we show that the probability of collision can be accurately approximated as functions of the chi-square similarity. For example, when the (un-normalized) data are binary, the maximum approximation error of the collision probability is smaller than 0.0192. In text and vision applications, the chi-square similarity is a popular measure for nonnegative data when the features are generated from histograms. Our experiments confirm that the proposed method is promising for large-scale learning applications
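As a rough illustration of the sign Cauchy projections described in this abstract, the Python sketch below estimates the collision probability (the fraction of projections whose signs differ) by simulation and computes the chi-square similarity of the same pair of vectors for comparison. The function names, the number of projections `k`, the seed, and the choice to normalize each vector to sum to one before computing the chi-square similarity are assumptions made for this sketch; they are not taken from the abstract.

```python
import numpy as np


def chi2_similarity(u, v):
    """Chi-square similarity sum_i 2*u_i*v_i/(u_i+v_i) of nonnegative vectors.

    Assumption: both vectors are first normalized to sum to one.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    u = u / u.sum()
    v = v / v.sum()
    denom = u + v
    mask = denom > 0  # skip coordinates where both entries are zero
    return float(np.sum(2.0 * u[mask] * v[mask] / denom[mask]))


def sign_cauchy_collision(u, v, k=20000, seed=0):
    """Monte Carlo estimate of Pr(sign(x) != sign(y)) under Cauchy projections."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row of R is one projection vector with iid standard Cauchy entries.
    R = rng.standard_cauchy(size=(k, u.shape[0]))
    x = R @ u
    y = R @ v
    return float(np.mean(np.sign(x) != np.sign(y)))


# Hypothetical binary example vectors.
a = [1, 1, 1, 0, 0, 0]
b = [1, 1, 0, 1, 0, 0]
print("chi-square similarity:", chi2_similarity(a, b))
print("empirical collision probability:", sign_cauchy_collision(a, b))
```

Only the signs of `x` and `y` are used, so rescaling either input vector by a positive constant leaves the estimated collision probability unchanged.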

arXiv.org e-Print Archive

CiteSeerX

Sign-Full Random Projections

Author: Li Ping
Publication date: 26/04/2018

The method of 1-bit ("sign-sign") random projections has been a popular tool for efficient search and machine learning on large datasets. Given two $D$-dim data vectors $u$, $v\in\mathbb{R}^D$, one can generate $x = \sum_{i=1}^D u_i r_i$, and $y = \sum_{i=1}^D v_i r_i$, where $r_i\sim N(0,1)$ iid. The "collision probability" is ${Pr}\left(sgn(x)=sgn(y)\right) = 1-\frac{\cos^{-1}\rho}{\pi}$, where $\rho = \rho(u,v)$ is the cosine similarity. We develop "sign-full" random projections by estimating $\rho$ from (e.g.,) the expectation $E(sgn(x)y)=\sqrt{\frac{2}{\pi}} \rho$, which can be further substantially improved by normalizing $y$. For nonnegative data, we recommend an interesting estimator based on $E\left(y_- 1_{x\geq 0} + y_+ 1_{x<0}\right)$ and its normalized version. The recommended estimator almost matches the accuracy of the (computationally expensive) maximum likelihood estimator. At high similarity ($\rho\rightarrow1$), the asymptotic variance of recommended estimator is only $\frac{4}{3\pi} \approx 0.4$ of the estimator for sign-sign projections. At small $k$ and high similarity, the improvement would be even much more substantial
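The two expectations quoted in this abstract can be turned into simple estimators of the cosine similarity. The Python sketch below is a minimal illustration under stated assumptions: both input vectors are scaled to unit norm (so that the projected values have unit variance and the quoted expectation applies directly), and the function names, the number of projections `k`, and the seed are hypothetical. It covers only the basic sign-sign and un-normalized sign-full estimators, not the normalized or recommended variants mentioned in the abstract.

```python
import numpy as np


def project(u, v, k, seed=0):
    """Project u and v with the same k Gaussian vectors (entries iid N(0,1))."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((k, len(u)))
    return R @ u, R @ v


def rho_sign_sign(x, y):
    """Invert Pr(sgn(x) = sgn(y)) = 1 - arccos(rho)/pi using the empirical rate."""
    p_hat = np.mean(np.sign(x) == np.sign(y))
    return float(np.cos(np.pi * (1.0 - p_hat)))


def rho_sign_full(x, y):
    """Invert E(sgn(x) y) = sqrt(2/pi) * rho using the empirical mean.

    Assumption: the vector that produced y has unit norm.
    """
    return float(np.sqrt(np.pi / 2.0) * np.mean(np.sign(x) * y))


# Hypothetical unit-norm vectors with cosine similarity 0.6.
u = np.array([1.0, 0.0, 0.0, 0.0])
v = np.array([0.6, 0.8, 0.0, 0.0])
x, y = project(u, v, k=100000)
print("sign-sign estimate:", rho_sign_sign(x, y))
print("sign-full estimate:", rho_sign_full(x, y))
```

Both estimates should land near the true cosine similarity when `k` is large; the sketch makes no attempt to reproduce the variance comparison reported in the abstract.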

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Optical Mobius Strips in Three Dimensional Ellipse Fields: Lines of Linear Polarization

Author: Isaac Freund
Publication venue: 'Elsevier BV'
Publication date: 20/05/2009

The minor axes of, and the normals to, the polarization ellipses that surround singular lines of linear polarization in three dimensional optical ellipse fields are shown to be organized into Mobius strips and into structures we call rippled rings (r-rings). The Mobius strips have two full twists, and can be either right- or left-handed. The major axes of the surrounding ellipses generate cone-like structures. Three orthogonal projections that give rise to 15 indices are used to characterize the different structures. These indices, if independent, could generate 839,808 geometrically and topologically distinct lines; selection rules are presented that reduce the number of lines to 8,248, some 5,562 of which have been observed in a computer simulation. Statistical probabilities are presented for the most important index combinations in random fields. It is argued that it is presently feasible to perform experimental measurements of the Mobius strips, r-rings, and cones described here theoretically

arXiv.org e-Print Archive

Crossref

Optical Mobius Strips in Three Dimensional Ellipse Fields: Lines of Circular Polarization

Author: Isaac Freund
Publication venue: 'Elsevier BV'
Publication date: 17/03/2009

The major and minor axes of the polarization ellipses that surround singular lines of circular polarization in three dimensional optical ellipse fields are shown to be organized into Mobius strips. These strips can have either one or three half-twists, and can be either right- or left-handed. The normals to the surrounding ellipses generate cone-like structures. Two special projections, one new geometrical, and seven new topological indices are developed to characterize the rather complex structures of the Mobius strips and cones. These eight indices, together with the two well-known indices used until now to characterize singular lines of circular polarization, could, if independent, generate 16,384 geometrically and topologically distinct lines. Geometric constraints and 13 selection rules are discussed that reduce the number of lines to 2,104, some 1,150 of which have been observed in practice; this number of different C lines is ~ 350 times greater than the three types of lines recognized previously. Statistical probabilities are presented for the most important index combinations in random fields. It is argued that it is presently feasible to perform experimental measurements of the Mobius strips and cones described here theoretically

arXiv.org e-Print Archive

Crossref

Sharp generalization error bounds for randomly-projected classifiers

Author: Durrant Robert J.
Kabán Ata
Publication venue: JMLR
Publication date: 01/01/2013

We derive sharp bounds on the generalization error of a generic linear classifier trained by empirical risk minimization on randomly projected data. We make no restrictive assumptions (such as sparsity or separability) on the data: Instead we use the fact that, in a classification setting, the question of interest is really ‘what is the effect of random projection on the predicted class labels?’ and we therefore derive the exact probability of ‘label flipping’ under Gaussian random projection in order to quantify this effect precisely in our bounds
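To make the notion of "label flipping" concrete, the Python sketch below estimates by simulation how often the sign of the inner product between a classifier direction `h` and a data point `x` changes after both are multiplied by the same Gaussian random matrix. This is a Monte Carlo illustration only, not the exact formula derived in the paper; the function name, the trial count, the seed, and the example vectors are assumptions made for this sketch.

```python
import numpy as np


def flip_probability(h, x, k, trials=20000, seed=0):
    """Monte Carlo estimate of Pr(sign((Rh).(Rx)) != sign(h.x)).

    R is a k-by-d matrix with iid N(0,1) entries, redrawn for every trial.
    """
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((trials, k, h.shape[0]))
    Rh = R @ h  # shape (trials, k)
    Rx = R @ x  # shape (trials, k)
    projected = np.sum(Rh * Rx, axis=1)  # inner product in the projected space
    return float(np.mean(np.sign(projected) != np.sign(h @ x)))


# Hypothetical example: the angle between h and x has cosine 0.6.
h = [1.0, 0.0, 0.0]
x = [0.6, 0.8, 0.0]
for k in (1, 2, 5, 10):
    print(k, flip_probability(h, x, k))
```

For `k = 1` the flip event is the same as two one-dimensional Gaussian projections having different signs, which happens with probability equal to the angle between `h` and `x` divided by pi; the estimates for larger `k` should be smaller.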

CiteSeerX

Research Commons@Waikato