40 research outputs found

    Highly Efficient Regression for Scalable Person Re-Identification

    Full text link
    Existing person re-identification models are poor for scaling up to large data required in real-world applications due to: (1) Complexity: They employ complex models for optimal performance resulting in high computational cost for training at a large scale; (2) Inadaptability: Once trained, they are unsuitable for incremental update to incorporate any new data available. This work proposes a truly scalable solution to re-id by addressing both problems. Specifically, a Highly Efficient Regression (HER) model is formulated by embedding the Fisher's criterion to a ridge regression model for very fast re-id model learning with scalable memory/storage usage. Importantly, this new HER model supports faster than real-time incremental model updates therefore making real-time active learning feasible in re-id with human-in-the-loop. Extensive experiments show that such a simple and fast model not only outperforms notably the state-of-the-art re-id methods, but also is more scalable to large data with additional benefits to active learning for reducing human labelling effort in re-id deployment

    Adaptive Locality Preserving Regression

    Full text link
    This paper proposes a novel discriminative regression method, called adaptive locality preserving regression (ALPR) for classification. In particular, ALPR aims to learn a more flexible and discriminative projection that not only preserves the intrinsic structure of data, but also possesses the properties of feature selection and interpretability. To this end, we introduce a target learning technique to adaptively learn a more discriminative and flexible target matrix rather than the pre-defined strict zero-one label matrix for regression. Then a locality preserving constraint regularized by the adaptive learned weights is further introduced to guide the projection learning, which is beneficial to learn a more discriminative projection and avoid overfitting. Moreover, we replace the conventional `Frobenius norm' with the special l21 norm to constrain the projection, which enables the method to adaptively select the most important features from the original high-dimensional data for feature extraction. In this way, the negative influence of the redundant features and noises residing in the original data can be greatly eliminated. Besides, the proposed method has good interpretability for features owing to the row-sparsity property of the l21 norm. Extensive experiments conducted on the synthetic database with manifold structure and many real-world databases prove the effectiveness of the proposed method.Comment: The paper has been accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), and the code can be available at https://drive.google.com/file/d/1iNzONkRByIaUhXwdEhOkkh_0d2AAXNE8/vie

    Person Re-identification in Identity Regression Space

    Get PDF
    This work was partially supported by the China Scholarship Council, Vision Semantics Ltd, Royal Society Newton Advanced Fellowship Programme (NA150459), and Innovate UK Industrial Challenge Project on Developing and Commercialising Intelligent Video Analytics Solutions for Public Safety (98111-571149)

    Sparse Modeling for Image and Vision Processing

    Get PDF
    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection---that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts.Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Visio

    Cross-class Transfer Learning for Visual Data

    Get PDF
    PhDAutomatic analysis of visual data is a key objective of computer vision research; and performing visual recognition of objects from images is one of the most important steps towards understanding and gaining insights into the visual data. Most existing approaches in the literature for the visual recognition are based on a supervised learning paradigm. Unfortunately, they require a large amount of labelled training data which severely limits their scalability. On the other hand, recognition is instantaneous and effortless for humans. They can recognise a new object without seeing any visual samples by just knowing the description of it, leveraging similarities between the description of the new object and previously learned concepts. Motivated by humans recognition ability, this thesis proposes novel approaches to tackle cross-class transfer learning (crossclass recognition) problem whose goal is to learn a model from seen classes (those with labelled training samples) that can generalise to unseen classes (those with labelled testing samples) without any training data i.e., seen and unseen classes are disjoint. Specifically, the thesis studies and develops new methods for addressing three variants of the cross-class transfer learning: Chapter 3 The first variant is transductive cross-class transfer learning, meaning labelled training set and unlabelled test set are available for model learning. Considering training set as the source domain and test set as the target domain, a typical cross-class transfer learning assumes that the source and target domains share a common semantic space, where visual feature vector extracted from an image can be embedded using an embedding function. Existing approaches learn this function from the source domain and apply it without adaptation to the target one. They are therefore prone to the domain shift problem i.e., the embedding function is only concerned with predicting the training seen class semantic representation in the learning stage during learning, when applied to the test data it may underperform. In this thesis, a novel cross-class transfer learning (CCTL) method is proposed based on unsupervised domain adaptation. Specifically, a novel regularised dictionary learning framework is formulated by which the target class labels are used to regularise the learned target domain embeddings thus effectively overcoming the projection domain shift problem. Chapter 4 The second variant is inductive cross-class transfer learning, that is, only training set is assumed to be available during model learning, resulting in a harder challenge compared to the previous one. Nevertheless, this setting reflects a real-world setting in which test data is available after the model learning. The main problem remains the same as the previous variant, that is, the domain shift problem occurs when the model learned only from the training set is applied to the test set without adaptation. In this thesis, a semantic autoencoder (SAE) is proposed building on an encoder-decoder paradigm. Specifically, first a semantic space is defined so that knowledge transfer is possible from the seen classes to the unseen classes. Then, an encoder aims to embed/project a visual feature vector into the semantic space. However, the decoder exerts a generative task, that is, the projection must be able to reconstruct the original visual features. The generative task forces the encoder to preserve richer information, thus the learned encoder from seen classes is able generalise better to the new unseen classes. Chapter 5 The third one is unsupervised cross-class transfer learning. In this variant, no supervision is available for model learning i.e., only unlabelled training data is available, leading to the hardest setting compared to the previous cases. The goal, however, is the same, learning some knowledge from the training data that can be transferred to the test data composed of completely different labels from that of training data. The thesis proposes a novel approach which requires no labelled training data yet is able to capture discriminative information. The proposed model is based on a new graph regularised dictionary learning algorithm. By introducing a l1- norm graph regularisation term, instead of the conventional squared l2-norm, the model is robust against outliers and noises typical in visual data. Importantly, the graph and representation are learned jointly, resulting in further alleviation of the effects of data outliers. As an application, person re-identification is considered for this variant in this thesis
    corecore