
    Sparse Matrix-based Random Projection for Classification

    As a typical dimensionality reduction technique, random projection can be implemented simply as a linear projection while preserving the pairwise distances of high-dimensional data with high probability. Because this technique is mainly used for classification, this paper studies the construction of the random matrix from the viewpoint of feature selection rather than traditional distance preservation. This yields a somewhat surprising theoretical result: the sparse random matrix with exactly one nonzero element per column can achieve better feature selection performance than denser matrices if the projection dimension is sufficiently large (namely, not much smaller than the number of feature elements); otherwise, it performs comparably to the others. For random projection, this theoretical result implies considerable improvement in both complexity and performance, which is widely confirmed by classification experiments on both synthetic and real data.
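    The kind of matrix described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the nonzero position or value is chosen, so the uniformly random row and the ±1 value are assumptions, and the function names are hypothetical.

```python
import numpy as np

def one_nonzero_per_column_matrix(k, d, rng):
    """Build a k x d matrix with exactly one nonzero entry in each column.

    Assumptions (not stated in the abstract): the nonzero row is drawn
    uniformly at random and the nonzero value is +1 or -1.
    """
    rows = rng.integers(0, k, size=d)          # one row index per column
    signs = rng.choice([-1.0, 1.0], size=d)    # assumed +/-1 values
    R = np.zeros((k, d))
    R[rows, np.arange(d)] = signs
    return R

def project(X, R):
    """Project the rows of X (n x d) down to k dimensions: returns X @ R.T."""
    return X @ R.T

rng = np.random.default_rng(0)
R = one_nonzero_per_column_matrix(k=8, d=20, rng=rng)
X = rng.standard_normal((5, 20))   # 5 samples with 20 features each
Y = project(X, R)                  # 5 samples with 8 projected features
```

    Because each column holds a single nonzero, every original feature contributes to exactly one projected coordinate, so the projection costs one addition or subtraction per feature.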

    Challenges of Big Data Analysis

    Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features drive paradigm changes in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
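    The spurious correlation named in this abstract can be illustrated with a small simulation. This is a sketch under assumed settings (sample size 50, independent standard normal data, a hypothetical helper name) and is not taken from the article: the response is independent of every feature, yet the largest sample correlation grows as more features are considered.

```python
import numpy as np

def max_abs_correlation(X, y):
    """Largest absolute sample correlation between y and any column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return float(np.max(np.abs(corr)))

rng = np.random.default_rng(0)
n = 50                                   # assumed sample size
X = rng.standard_normal((n, 2000))       # features independent of y
y = rng.standard_normal(n)

# Use nested feature sets (first p columns) so the comparison is like for like.
results = {p: max_abs_correlation(X[:, :p], y) for p in (10, 100, 2000)}
for p, value in results.items():
    print(f"p = {p:5d}  max |corr| = {value:.3f}")
```

    Since the feature sets are nested, the maximum can only increase with the dimension, even though no feature is truly related to the response.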

    Illumination strategies for intensity-only imaging

    We propose a new strategy for narrow-band, active array imaging of localized scatterers when only the intensities are recorded and measured at the array. We consider a homogeneous medium so that wave propagation is fully coherent. We show that imaging with intensity-only measurements can be carried out using the time reversal operator of the imaging system, which can be obtained from intensity measurements using an appropriate illumination strategy and the polarization identity. Once the time reversal operator has been obtained, we show that the images can be formed using its singular value decomposition (SVD). We use two SVD-based methods to image the scatterers. The proposed approach is simple and efficient. It does not need prior information about the sought image, and guarantees exact recovery in the noise-free case. Furthermore, it is robust with respect to additive noise. Detailed numerical simulations illustrate the performance of the proposed imaging strategy when only the intensities are captured.
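    The role of the polarization identity can be sketched numerically. The abstract does not spell out the illumination scheme, so the following is one possible choice and not necessarily the authors' method: `P` is an assumed complex array response matrix, the illuminations are sums of unit vectors, and only the total received intensity of each illumination is used to recover the operator `P^H P`.

```python
import numpy as np

def total_intensity(P, f):
    """Total intensity recorded at the array for illumination f (phases are lost)."""
    return float(np.sum(np.abs(P @ f) ** 2))

def time_reversal_from_intensities(P):
    """Recover M = P^H P from intensity-only data via the polarization identity.

    For a = P e_i and b = P e_j, the identity gives
    a^H b = ( |a+b|^2 - |a-b|^2 + 1j * (|a-1j*b|^2 - |a+1j*b|^2) ) / 4.
    """
    n = P.shape[1]
    E = np.eye(n, dtype=complex)
    M = np.zeros((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            re = total_intensity(P, E[i] + E[j]) - total_intensity(P, E[i] - E[j])
            im = (total_intensity(P, E[i] - 1j * E[j])
                  - total_intensity(P, E[i] + 1j * E[j]))
            M[i, j] = (re + 1j * im) / 4
    return M

rng = np.random.default_rng(0)
P = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))  # assumed response
M = time_reversal_from_intensities(P)
U, s, Vh = np.linalg.svd(M)   # the SVD of M is the starting point for imaging
```

    In this noise-free setting `M` matches `P.conj().T @ P` up to floating-point error, which is consistent with the exact recovery claimed in the abstract.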