Sparse Matrix-based Random Projection for Classification
As a typical dimensionality reduction technique, random projection can be
implemented as a simple linear projection while preserving the pairwise
distances of high-dimensional data with high probability. Since this
technique is mainly exploited for classification tasks, this paper studies
the construction of the random matrix from the viewpoint of feature
selection, rather than of traditional distance preservation. This yields a
somewhat surprising theoretical result: a sparse random matrix with exactly
one nonzero element per column can achieve better feature selection
performance than denser matrices, provided the projection dimension is
sufficiently large (namely, not much smaller than the number of feature
elements); otherwise, it performs comparably to them. For random projection,
this result implies considerable improvements in both complexity and
performance, which is widely confirmed by classification experiments on both
synthetic and real data.
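A minimal sketch of the matrix construction the abstract describes, with exactly one nonzero entry (a random sign in a random row) per column; the dimensions, the data, and the distance check are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_features, n_proj = 1000, 200  # assumed illustrative sizes

# Sparse random matrix: each column holds exactly one nonzero,
# a random sign (+1/-1) placed in a uniformly random row.
R = np.zeros((n_proj, n_features))
R[rng.integers(0, n_proj, size=n_features), np.arange(n_features)] = \
    rng.choice([-1.0, 1.0], size=n_features)

# Project synthetic data and compare pairwise distances before/after:
# E[||Rx||^2] = ||x||^2 for this construction, so no rescaling is needed.
X = rng.standard_normal((20, n_features))
Y = X @ R.T
ratios = [np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
          for i, j in combinations(range(20), 2)]
print(round(float(np.median(ratios)), 2))  # median distance ratio, close to 1
```

Because each column has a single nonzero, computing `X @ R.T` costs one add per input coordinate, which is the complexity gain over dense random projections.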
Challenges of Big Data Analysis
Big Data brings new opportunities to modern society and new challenges to
data scientists. On the one hand, Big Data holds great promise for
discovering subtle population patterns and heterogeneities that cannot be
found with small-scale data. On the other hand, the massive sample size and
high dimensionality of Big Data introduce unique computational and
statistical challenges, including scalability and storage bottlenecks, noise
accumulation, spurious correlation, incidental endogeneity, and measurement
errors. These challenges are distinctive and require new computational and
statistical paradigms. This article gives an overview of the salient
features of Big Data and of how these features drive paradigm changes in
statistical and computational methods as well as computing architectures. We
also provide various new perspectives on Big Data analysis and computation.
In particular, we emphasize the viability of the sparsest solution in a
high-confidence set and point out that the exogeneity assumptions in most
statistical methods for Big Data cannot be validated due to incidental
endogeneity. This can lead to wrong statistical inferences and,
consequently, wrong scientific conclusions.
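The spurious-correlation point can be illustrated with a quick simulation (my own sketch, not from the article): with a fixed sample size, the largest absolute sample correlation between a response and a growing number of completely independent noise predictors keeps increasing.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50  # fixed sample size (assumed for illustration)

def max_abs_corr(p):
    # Largest absolute sample correlation between a response y and
    # p predictors, all drawn independently, so any correlation is spurious.
    y = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    yc = (y - y.mean()) / y.std()
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    return float(np.abs(yc @ Xc / n).max())

vals = [max_abs_corr(p) for p in (10, 1000, 100_000)]
print([round(v, 2) for v in vals])  # grows with p despite independence
```

Classical theory suggests the maximum grows roughly like sqrt(2 log p / n), so at high dimensionality some noise variable will look strongly "predictive" by chance.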
Illumination strategies for intensity-only imaging
We propose a new strategy for narrow-band, active array imaging of localized
scatterers when only the intensities are recorded at the array.
We consider a homogeneous medium so that wave propagation is fully coherent. We
show that imaging with intensity-only measurements can be carried out using the
time reversal operator of the imaging system, which can be obtained from
intensity measurements using an appropriate illumination strategy and the
polarization identity. Once the time reversal operator has been obtained, we
show that the images can be formed using its singular value decomposition
(SVD). We use two SVD-based methods to image the scatterers. The proposed
approach is simple and efficient. It does not need prior information about the
sought image, and guarantees exact recovery in the noise-free case.
Furthermore, it is robust with respect to additive noise. Detailed numerical
simulations illustrate the performance of the proposed imaging strategy when
only the intensities are captured.
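The polarization identity the abstract relies on recovers a complex inner product from four intensity-only (squared-magnitude) measurements. A minimal numpy sketch, with my own notation and random test vectors standing in for array responses under different illuminations:

```python
import numpy as np

def inner_from_intensities(u, v):
    # Recover <u, v> (conjugate-linear in u, matching np.vdot) from four
    # squared-norm "intensity" measurements via the polarization identity.
    I = lambda w: np.linalg.norm(w) ** 2      # intensity-only measurement
    re = (I(u + v) - I(u - v)) / 4.0          # real part
    im = (I(u - 1j * v) - I(u + 1j * v)) / 4.0  # imaginary part
    return re + 1j * im

rng = np.random.default_rng(2)
u = rng.standard_normal(8) + 1j * rng.standard_normal(8)
v = rng.standard_normal(8) + 1j * rng.standard_normal(8)
print(np.allclose(inner_from_intensities(u, v), np.vdot(u, v)))  # True
```

In the imaging setting, pairwise inner products of this kind, obtained by illuminating with sums and differences of probing vectors, assemble the entries of the time reversal operator, whose SVD then yields the images.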