207 research outputs found

    Compressed learning

    Full text link
    University of Technology, Sydney. Faculty of Engineering and Information Technology.There has been an explosion of data derived from the internet and other digital sources. These data are usually multi-dimensional, massive in volume, frequently incomplete, noisy, and complicated in structure. These "big data" bring new challenges to machine learning (ML), which has historically been designed for small volumes of clearly defined and structured data. In this thesis we propose new methods of "compressed learning", which explore the components and procedures in ML methods that are compressible, in order to improve their robustness, scalability, adaptivity, and performance for big data analysis. We will study novel methodologies that compress different components throughout the learning process, propose more interpretable general compressible structures for big data, and develop effective strategies to leverage these compressible structures to produce highly scalable learning algorithms. We present several new insights into popular learning problems in the context of compressed learning. The theoretical analyses are tested on real data in order to demonstrate the efficacy and efficiency of the methodologies in real-world scenarios. In particular, we propose "manifold elastic net (MEN)" and "double shrinking (DS)" as two fast frameworks extracting low-dimensional sparse features for dimension reduction and manifold learning. These methods compress the features on both their dimension and cardinality, and significantly improve their interpretation and performance in clustering and classification tasks. We study how to derive fewer "anchor points" for representing large datasets in their entirety by proposing "divide-and-conquer anchoring", in which the global solution is rapidly found for near-separable non-negative matrix factorization and completion in a distributed manner. This method represents a compression of the big data itself, rather than features, and the extracted anchors define the structure of the data. Two fast low-rank approximation methods, "bilateral random projections (BRP)" of fast computer closed-form and "greedy bilateral sketch (GreBske)", are proposed based on random projection and greedy augmenting update rules. They can be broadly applied to learning procedures that requires updates of a low-rank matrix variable and result in significant acceleration in performance. We study how to compress noisy data for learning by decomposing it into the sum mixture of low-rank part and sparse part. "GO decomposition (GoDec)" and the "greedy bilateral (GreB)" paradigm are proposed as two efficient approaches to this problem based on randomized and greedy strategies, respectively. Modifications of these two schemes result in novel models and extremely fast algorithms for matrix completion that aim to recover a low-rank matrix from a small number of its entries. In addition, we extend the GoDec problem in order to unmix more than two incoherent structures that are more complicated and expressive than low-rank or sparse matrices. The three proposed variants are not only novel and effective algorithms for motion segmentation in computer vision, multi-label learning, and scoring-function learning in recommendation systems, but also reveal new theoretical insights into these problems. Finally, a compressed learning method termed “compressed labelling (CL) on distilled label sets (DL)" is proposed for solving the three core problems in multi-label learning, namely high-dimensional labels, label correlation modeling, and sample imbalance for each label. By compressing the labels and the number of classifiers in multi-label learning, CL can generate an effective and efficient training algorithm from any single-label classifier

    The Incremental Multiresolution Matrix Factorization Algorithm

    Full text link
    Multiresolution analysis and matrix factorization are foundational tools in computer vision. In this work, we study the interface between these two distinct topics and obtain techniques to uncover hierarchical block structure in symmetric matrices -- an important aspect in the success of many vision problems. Our new algorithm, the incremental multiresolution matrix factorization, uncovers such structure one feature at a time, and hence scales well to large matrices. We describe how this multiscale analysis goes much farther than what a direct global factorization of the data can identify. We evaluate the efficacy of the resulting factorizations for relative leveraging within regression tasks using medical imaging data. We also use the factorization on representations learned by popular deep networks, providing evidence of their ability to infer semantic relationships even when they are not explicitly trained to do so. We show that this algorithm can be used as an exploratory tool to improve the network architecture, and within numerous other settings in vision.Comment: Computer Vision and Pattern Recognition (CVPR) 2017, 10 page

    Flexible unsupervised feature extraction for image classification

    Get PDF
    Dimensionality reduction is one of the fundamental and important topics in the fields of pattern recognition and machine learning. However, most existing dimensionality reduction methods aim to seek a projection matrix W such that the projection W T x is exactly equal to the true low-dimensional representation. In practice, this constraint is too rigid to well capture the geometric structure of data. To tackle this problem, we relax this constraint but use an elastic one on the projection with the aim to reveal the geometric structure of data. Based on this context, we propose an unsupervised dimensionality reduction model named flexible unsupervised feature extraction (FUFE) for image classification. Moreover, we theoretically prove that PCA and LPP, which are two of the most representative unsupervised dimensionality reduction models, are special cases of FUFE, and propose a non-iterative algorithm to solve it. Experiments on five real-world image databases show the effectiveness of the proposed model

    Object Detection in 20 Years: A Survey

    Full text link
    Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible publicatio

    Learning midlevel image features for natural scene and texture classification

    Get PDF
    This paper deals with coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the space of representation for faster computation. The proposed approach is tested in the context of texture classification (111 classes), as well as natural scenes classification (11 categories, 2037 images). Using a common protocol, the other commonly used descriptors have at most 47.7% accuracy on average while our method obtains performances of up to 63.8%. We show that this advantage does not depend on the size of the signature and demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensio

    Non-Decimated Wavelet based Multi-Band Ear Recognition using Principal Component Analysis

    Get PDF
    Principal Component Analysis (PCA) has been successfully applied to many applications, including ear recognition. This paper presents a 2D Wavelet based Multi-Band Principal Component Analysis (2D-WMBPCA) ear recognition method, inspired by PCA based techniques for multispectral and hyperspectral images. The proposed 2D-WMBPCA method performs a 2D non-decimated wavelet transform on the input image, dividing it into its wavelet subbands. Each resulting subband is then divided into a number of frames based on its coefficient’s values. The multi frame generation boundaries are calculated using either equal size or greedy hill climbing techniques. Conventional PCA is applied on each subband’s resulting frames, yielding its eigenvectors, which are used for matching. The intersection of the energy of the eigenvectors and the total number of features for each subband shows the number of bands which yield the highest matching performance. Experimental results on the images of two benchmark ear datasets, called IITD II and USTB I, demonstrated that the proposed 2D-WMBPCA technique significantly outperforms Single Image PCA by up to 56.79% and the eigenfaces technique by up to 20.37% with respect to matching accuracy. Furthermore, the proposed technique achieves very competitive results to those of learning based techniques at a fraction of their computational time and without needing to be trained
    • 

    corecore