Tight conditions for consistency of variable selection in the context of high dimensionality
We address the issue of variable selection in the regression model with very
high ambient dimension, that is, when the number of variables is very large.
The main focus is on the situation where the number of relevant variables,
called intrinsic dimension, is much smaller than the ambient dimension d.
Without assuming any parametric form of the underlying regression function, we
get tight conditions making it possible to consistently estimate the set of
relevant variables. These conditions relate the intrinsic dimension to the
ambient dimension and to the sample size. The procedure that is provably
consistent under these tight conditions is based on comparing quadratic
functionals of the empirical Fourier coefficients with appropriately chosen
threshold values. The asymptotic analysis reveals the presence of two quite
different regimes. The first regime is when the intrinsic dimension is fixed.
In this case the situation in nonparametric regression is the same as in linear
regression, that is, consistent variable selection is possible if and only if
log d is small compared to the sample size n. The picture is different in the
second regime, that is, when the number of relevant variables, denoted by s,
tends to infinity as the sample size grows. Then we prove that consistent
variable selection in the nonparametric set-up is possible only if s + log log d
is small compared to log n. We apply these results to derive minimax separation
rates for the problem of variable selection.Comment: arXiv admin note: text
overlap with arXiv:1102.3616. Published at
http://dx.doi.org/10.1214/12-AOS1046 in the Annals of Statistics
(http://www.imstat.org/aos/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
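The thresholding idea in the abstract above can be illustrated with a toy sketch. This is a drastically simplified, first-order version of the approach, not the paper's actual procedure: for each candidate variable we form a quadratic functional of its empirical Fourier coefficients and flag the variable as relevant when it exceeds a threshold. The function name `select_relevant` and the specific threshold are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: d candidate variables, only the first two are relevant.
n, d = 2000, 50
X = rng.uniform(0.0, 1.0, size=(n, d))
y = np.sin(2 * np.pi * X[:, 0]) + np.cos(2 * np.pi * X[:, 1]) \
    + 0.1 * rng.standard_normal(n)

def select_relevant(X, y, n_freq=3, threshold=0.01):
    """Flag variable j as relevant when a quadratic functional of its
    first-order empirical Fourier coefficients exceeds a threshold.
    (A simplified first-order sketch of the idea, not the paper's
    full multivariate procedure.)"""
    n, d = X.shape
    selected = []
    for j in range(d):
        q = 0.0
        for k in range(1, n_freq + 1):
            # Empirical Fourier coefficients along variable j
            # (cosine and sine parts at frequency k).
            c = np.mean(y * np.cos(2 * np.pi * k * X[:, j]))
            s = np.mean(y * np.sin(2 * np.pi * k * X[:, j]))
            q += c ** 2 + s ** 2
        if q > threshold:
            selected.append(j)
    return selected

print(select_relevant(X, y))  # → [0, 1]
```

For an irrelevant variable the empirical coefficients are of order 1/√n, so the quadratic functional stays far below the threshold, while relevant variables contribute coefficients of constant order.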
Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity
We prove a sub-linear upper bound on the Fourier dimension of any Boolean
function in terms of its Fourier sparsity s. Our proof method yields a further
improved bound assuming a conjecture of Tsang et al. \cite{tsang}, that for
every Boolean function of sparsity s there is an affine subspace of F_2^n of
co-dimension O(polylog s) restricted to which the function is constant. This
conjectured bound is tight up to poly-logarithmic factors, as the Fourier
dimension and sparsity of the address function are quadratically separated. We
obtain these bounds by observing that the Fourier dimension of a Boolean
function is equivalent to its non-adaptive parity decision tree complexity,
and then bounding the latter.
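The two quantities related in the abstract above can be computed by brute force for small functions. This sketch (illustrative helper names, not from the paper) computes the Fourier spectrum of a ±1-valued Boolean function via the Walsh-Hadamard transform; the sparsity is the number of nonzero coefficients, and the Fourier dimension is the dimension of the F_2-span of the support.

```python
from itertools import product

def fourier_coeffs(f, n):
    """Walsh-Hadamard transform of a ±1-valued f: {0,1}^n -> {±1}."""
    coeffs = {}
    for gamma in product([0, 1], repeat=n):
        total = sum(f(x) * (-1) ** (sum(g * xi for g, xi in zip(gamma, x)) % 2)
                    for x in product([0, 1], repeat=n))
        c = total / 2 ** n
        if abs(c) > 1e-9:
            coeffs[gamma] = c
    return coeffs

def f2_rank(vectors):
    """Rank over F_2 of a list of 0/1 vectors (Gaussian elimination)."""
    rows = [list(v) for v in vectors]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# AND on 2 bits, written as a ±1-valued function.
f_and = lambda x: -1 if (x[0] and x[1]) else 1
spec = fourier_coeffs(f_and, 2)
sparsity = len(spec)                    # number of nonzero coefficients
dimension = f2_rank(list(spec.keys()))  # dim of the F_2-span of the support
print(sparsity, dimension)  # → 4 2
```

For 2-bit AND the spectrum is supported on all four characters (sparsity 4), but those characters span only a 2-dimensional subspace of F_2^2, so the Fourier dimension is 2.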
Fourier sparsity, spectral norm, and the Log-rank conjecture
We study Boolean functions with sparse Fourier coefficients or small spectral
norm, and show their applications to the Log-rank Conjecture for XOR functions
f(x\oplus y) --- a fairly large class of functions including well studied ones
such as Equality and Hamming Distance. The rank of the communication matrix M_f
for such functions is exactly the Fourier sparsity of f. Let d be the F2-degree
of f and D^CC(f) stand for the deterministic communication complexity for
f(x\oplus y). We show that 1. D^CC(f) = O(2^{d^2/2} log^{d-2} ||\hat f||_1). In
particular, the Log-rank conjecture holds for XOR functions with constant
F2-degree. 2. D^CC(f) = O(d ||\hat f||_1) = O(\sqrt{rank(M_f)} \log rank(M_f)).
We obtain our results through a degree-reduction protocol based on a variant of
polynomial rank, and actually conjecture that its communication cost is already
\log^{O(1)}rank(M_f). The above bounds also hold for the parity decision tree
complexity of f, a measure that is no less than the communication complexity
(up to a factor of 2).
Along the way we also show several structural results about Boolean functions
with small F2-degree or small spectral norm, which could be of independent
interest. For functions f with constant F2-degree: 1) f can be written as the
summation of quasi-polynomially many indicator functions of subspaces with
\pm-signs, improving the previous doubly exponential upper bound by Green and
Sanders; 2) being sparse in Fourier domain is polynomially equivalent to having
a small parity decision tree complexity; 3) f depends only on polylog||\hat
f||_1 linear functions of input variables. For functions f with small spectral
norm: 1) there is an affine subspace with co-dimension O(||\hat f||_1) on which
f is a constant; 2) there is a parity decision tree with depth O(||\hat f||_1
log ||\hat f||_0).Comment: v2: Corollary 31 of v1 removed because of a bug in
the proof. (Other results not affected.)
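The abstract's starting observation, that the real rank of the communication matrix M_f of an XOR function F(x, y) = f(x ⊕ y) equals the Fourier sparsity of f, can be checked numerically on a small example (the helper `sparsity` is an illustrative brute-force computation):

```python
import numpy as np
from itertools import product

n = 2
f = lambda x: -1 if (x[0] and x[1]) else 1   # ±1-valued AND on 2 bits

points = list(product([0, 1], repeat=n))
xor = lambda x, y: tuple(a ^ b for a, b in zip(x, y))

# Communication matrix of the XOR function F(x, y) = f(x ⊕ y).
M = np.array([[f(xor(x, y)) for y in points] for x in points])

def sparsity(f, n):
    """Fourier sparsity of f via the Walsh-Hadamard transform."""
    count = 0
    for gamma in product([0, 1], repeat=n):
        c = sum(f(x) * (-1) ** (sum(g * xi for g, xi in zip(gamma, x)) % 2)
                for x in points) / 2 ** n
        if abs(c) > 1e-9:
            count += 1
    return count

print(np.linalg.matrix_rank(M), sparsity(f, n))  # → 4 4
```

This holds because the characters χ_γ diagonalize M_f, with eigenvalues proportional to the Fourier coefficients \hat f(γ), so the nonzero eigenvalues are exactly the nonzero coefficients.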
Multimodal Multipart Learning for Action Recognition in Depth Videos
The articulated and complex nature of human actions makes the task of action
recognition difficult. One approach to handling this complexity is dividing it
into the kinetics of body parts and analyzing the actions based on these
partial descriptors. We propose a joint sparse regression based learning
method which utilizes structured sparsity to model each action as a
combination of multimodal features from a sparse set of body parts. To
represent the dynamics and appearance of parts, we employ a heterogeneous set
of depth and skeleton based features. The proper structure of the multimodal
multipart features is formulated into the learning framework via the proposed
hierarchical mixed norm, which regularizes the structured features of each
part and applies sparsity between them, in favor of group feature selection.
Our experimental results demonstrate the effectiveness of the proposed
learning method, which outperforms other methods on all three tested datasets
and saturates one of them by achieving perfect accuracy.
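The group-selection effect of a mixed norm can be illustrated with a minimal sketch. This is a plain l1/l2 group norm, a simplified stand-in for the paper's hierarchical mixed norm, and the part grouping here is invented for illustration:

```python
import numpy as np

def mixed_norm(w, groups):
    """l1/l2 mixed norm: sum over groups of the l2 norm of that group's
    coefficients. Penalizing this drives entire groups (body parts) to
    zero, i.e. group-level feature selection. A simplified sketch of a
    hierarchical mixed-norm regularizer, not the paper's exact one."""
    return sum(np.linalg.norm(w[idx]) for idx in groups)

# Toy coefficient vector split into three "body part" groups; only the
# middle part carries nonzero weights.
w = np.array([0.0, 0.0, 0.5, 0.5, 0.0, 0.0])
groups = [slice(0, 2), slice(2, 4), slice(4, 6)]
print(mixed_norm(w, groups))  # → 0.7071... (= sqrt(0.5^2 + 0.5^2))
```

Because the penalty is an l1 sum of per-group l2 norms, minimizing it trades off groups against each other as whole units, which is what yields a sparse set of active body parts.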