    Group Membership Prediction

    Full text link
    The group membership prediction (GMP) problem involves predicting whether or not a collection of instances shares a certain semantic property. For instance, in kinship verification, given a collection of images the goal is to predict whether or not they share a familial relationship. In this context we propose a novel probability model and introduce latent view-specific and view-shared random variables to jointly account for the view-specific appearance and cross-view similarities among data instances. Our model posits that data from each view is independent conditioned on the shared variables. This postulate leads to a parametric probability model that decomposes the group membership likelihood into a tensor product of data-independent parameters and data-dependent factors. We propose learning the data-independent parameters discriminatively with bilinear classifiers, and test our prediction algorithm on challenging visual recognition tasks such as multi-camera person re-identification and kinship verification. On most benchmark datasets, our method significantly outperforms the current state of the art. Comment: accepted for ICCV 201
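
    As a rough illustration (not the paper's exact formulation; all symbols below are placeholders), the conditional-independence postulate over V views and a bilinear pairwise score can be written as

        p(x_1, \dots, x_V \mid h, s_1, \dots, s_V) = \prod_{v=1}^{V} p(x_v \mid h, s_v),
        \qquad
        \mathrm{score}(x_a, x_b) = \phi(x_a)^\top W \, \phi(x_b),

    where h stands for the view-shared latent variables, s_v for the view-specific ones, \phi(\cdot) for a feature map, and W for the bilinear parameter matrix learned discriminatively.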

    Improving CNN-based Person Re-identification using score Normalization

    Full text link
    Person re-identification (PRe-ID) is a crucial task in security, surveillance, and retail analysis, which involves identifying an individual across multiple cameras and views. However, it is a challenging task due to changes in illumination, background, and viewpoint. Efficient feature extraction and metric learning algorithms are essential for a successful PRe-ID system. This paper proposes a novel approach for PRe-ID, which combines a Convolutional Neural Network (CNN) based feature extraction method with Cross-view Quadratic Discriminant Analysis (XQDA) for metric learning. Additionally, a matching algorithm that employs Mahalanobis distance and a score normalization process to address inconsistencies between camera scores is implemented. The proposed approach is tested on four challenging datasets, including VIPeR, GRID, CUHK01, and PRID450S, and promising results are obtained. For example, without normalization, the rank-20 rate accuracies of the GRID, CUHK01, VIPeR and PRID450S datasets were 61.92%, 83.90%, 92.03%, 96.22%; however, after score normalization, they have increased to 64.64%, 89.30%, 92.78%, and 98.76%, respectively. Accordingly, the promising results on four challenging datasets indicate the effectiveness of the proposed approach.Comment: 5 pages, 6 figures and 2 table
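
    The abstract does not spell out the normalization scheme; as a minimal sketch under that caveat, Mahalanobis scoring with a learned metric M followed by a simple per-probe z-score normalization (one plausible choice, not necessarily the paper's) could look like:

        import numpy as np

        def mahalanobis_scores(probes, gallery, M):
            # squared Mahalanobis distance d(p, g) = (p - g)^T M (p - g); lower means more similar
            diffs = probes[:, None, :] - gallery[None, :, :]
            return np.einsum('pgd,de,pge->pg', diffs, M, diffs)

        def zscore_normalize(scores):
            # align score distributions across cameras by normalizing each probe's row of scores
            mu = scores.mean(axis=1, keepdims=True)
            sigma = scores.std(axis=1, keepdims=True) + 1e-12
            return (scores - mu) / sigma

    Ranking then proceeds by sorting the (normalized) scores in ascending order for each probe.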

    The Role of Riemannian Manifolds in Computer Vision: From Coding to Deep Metric Learning

    Get PDF
    A diverse number of tasks in computer vision and machine learning benefit from representations of data that are compact yet discriminative, informative and robust to critical measurements. Two notable representations are offered by Region Covariance Descriptors (RCovD) and linear subspaces, which are naturally analyzed through the manifold of Symmetric Positive Definite (SPD) matrices and the Grassmann manifold, respectively, two widely used types of Riemannian manifolds in computer vision. As our first objective, we examine image and video-based recognition applications where the local descriptors have the aforementioned Riemannian structures, namely the SPD or linear subspace structure. Initially, we provide a solution to compute a Riemannian version of the conventional Vector of Locally Aggregated Descriptors (VLAD), using the geodesic distance of the underlying manifold as the nearness measure. Next, by having a closer look at the resulting codes, we formulate a new concept which we name Local Difference Vectors (LDV). LDVs enable us to elegantly extend our Riemannian coding techniques to any arbitrary metric, as well as to provide intrinsic solutions to Riemannian sparse coding and its variants when local structured descriptors are considered. We then turn our attention to two special types of covariance descriptors, namely infinite-dimensional RCovDs and rank-deficient covariance matrices, for which the underlying Riemannian structure, i.e. the manifold of SPD matrices, is to a great extent out of reach. Generally speaking, infinite-dimensional RCovDs offer better discriminatory power than their low-dimensional counterparts. To overcome this difficulty, we propose to approximate the infinite-dimensional RCovDs by making use of two feature mappings, namely random Fourier features and the Nyström method. As for the rank-deficient covariance matrices, unlike most existing approaches that employ inference tools with predefined regularizers, we derive positive definite kernels that can be decomposed into kernels on the cone of SPD matrices and kernels on the Grassmann manifold, and show their effectiveness for the image set classification task. Furthermore, inspired by attractive properties of Riemannian optimization techniques, we extend the recently introduced Keep It Simple and Straightforward MEtric learning (KISSME) method to scenarios where the input data is non-linearly distributed. To this end, we make use of infinite-dimensional covariance matrices and propose techniques for projecting onto the positive cone in a Reproducing Kernel Hilbert Space (RKHS). We also address the sensitivity of KISSME to the input dimensionality. The KISSME algorithm is greatly dependent on Principal Component Analysis (PCA) as a preprocessing step, which can lead to difficulties, especially when the dimensionality is not meticulously set. To address this issue, based on the KISSME algorithm, we develop a Riemannian framework to jointly learn a mapping performing dimensionality reduction and a metric in the induced space. Lastly, in line with the recent trend in metric learning, we devise end-to-end learning of a generic deep network for metric learning using our derivation
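
    For context, the linear KISSME baseline that this thesis kernelizes can be sketched as follows (this is the standard formulation of the original method, not the thesis's RKHS or Riemannian extension; array names are illustrative):

        import numpy as np

        def kissme_metric(diffs_similar, diffs_dissimilar):
            # diffs_*: (n_pairs, d) arrays of feature differences for similar / dissimilar pairs
            cov_s = diffs_similar.T @ diffs_similar / len(diffs_similar)
            cov_d = diffs_dissimilar.T @ diffs_dissimilar / len(diffs_dissimilar)
            M = np.linalg.inv(cov_s) - np.linalg.inv(cov_d)
            # project the symmetric matrix M onto the positive semi-definite cone
            w, V = np.linalg.eigh(M)
            return (V * np.clip(w, 0.0, None)) @ V.T

    The thesis replaces the explicit covariances with their infinite-dimensional (kernel) counterparts and carries out the cone projection in the RKHS.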

    Similarity learning for person re-identification and semantic video retrieval

    Full text link
    Many computer vision problems boil down to learning a good visual similarity function that scores how likely two instances are to share the same semantic concept. In this thesis, we focus on two problems related to similarity learning: Person Re-Identification and Semantic Video Retrieval. Person Re-Identification aims to maintain the identity of an individual across diverse locations through different non-overlapping camera views. Starting with two cameras, we propose a novel visual word co-occurrence based appearance model to measure the similarities between pedestrian images. This model naturally accounts for spatial similarities and variations caused by pose, illumination and configuration changes across camera views. As a generalization to multiple camera views, we introduce the Group Membership Prediction (GMP) problem. The GMP problem involves predicting whether a collection of instances shares the same semantic property. In this context, we propose a novel probability model and introduce latent view-specific and view-shared random variables to jointly account for the view-specific appearance and cross-view similarities among data instances. Our method is tested on various benchmarks, demonstrating superior accuracy over the state of the art. Semantic Video Retrieval seeks to match complex activities in a surveillance video to user-described queries. In surveillance scenarios, where noise and clutter are usually present, visual uncertainties introduced by error-prone low-level detectors, classifiers and trackers constitute a significant part of the semantic gap between user-defined queries and the archived video. To bridge the gap, we propose a novel probabilistic activity localization formulation that incorporates learning of object attributes, between-object relationships, and object re-identification without activity-level training data. Our experiments demonstrate that the introduction of similarity learning components effectively compensates for noise and errors in previous stages, and results in preferable performance on both aerial and ground surveillance videos. Considering the computational complexity of our similarity learning models, we attempt to develop a way of training complicated models efficiently while retaining good performance. As a proof of concept, we propose training deep neural networks for supervised learning of hash codes. With slight changes in the optimization formulation, we can explore the possibility of incorporating this training framework into Person Re-Identification and related problems.
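
    The thesis abstract does not give the hashing objective; one generic formulation (purely illustrative, not necessarily the author's) pairs a small encoder with a tanh relaxation of the sign function and a pairwise similarity loss:

        import torch
        import torch.nn as nn

        class HashNet(nn.Module):
            # maps a feature vector to k relaxed binary codes in (-1, 1)
            def __init__(self, in_dim, k_bits=64):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, k_bits))

            def forward(self, x):
                return torch.tanh(self.net(x))  # tanh relaxes sign() so gradients can flow

        def pairwise_hash_loss(codes, sim):
            # sim[i, j] = +1 for pairs sharing a concept/identity, -1 otherwise
            k = codes.shape[1]
            inner = codes @ codes.t() / k   # in [-1, 1]; approximates a normalized code agreement
            return ((inner - sim) ** 2).mean()

    At test time the codes would be binarized with sign() and compared by Hamming distance.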

    Tensor cross-view quadratic discriminant analysis for kinship verification in the wild

    No full text
    This paper presents a new Tensor Cross-view Quadratic Discriminant Analysis (TXQDA) method, based on the XQDA method, for kinship verification in the wild. Many researchers have used metric learning methods and achieved reasonably good performance in kinship verification, but none of these methods treats kinship verification as a cross-view matching problem. To tackle this issue, we propose a tensor cross-view method to train multilinear data using local histograms of local feature descriptors. We learn a hierarchical tensor transformation to project each pair of face images into the same implicit feature space, in which the distance of each positive pair is minimized and that of each negative pair is maximized. Moreover, TXQDA separates the multifactor structure of face images (i.e. kinship, age, gender, expression, illumination and pose) along the different dimensions of the tensor. Thus, our TXQDA achieves better classification results by discovering a low-dimensional tensor subspace that enlarges the margin between different kin relation classes. Experimental evaluation on five challenging databases, namely the Cornell KinFace, UB KinFace, TSKinFace, KinFaceW-II and FIW databases, shows that the proposed TXQDA significantly outperforms the current state of the art.
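
    For reference, the vector-based XQDA distance that TXQDA generalizes to tensor data is usually written as (standard formulation from the XQDA literature; the paper's tensor extension is not reproduced here):

        d(x_i, x_j) = (x_i - x_j)^\top W \left( \Sigma_I'^{-1} - \Sigma_E'^{-1} \right) W^\top (x_i - x_j),

    where W is the learned cross-view subspace and \Sigma_I', \Sigma_E' are the covariance matrices of intra-class and extra-class pair differences in that subspace.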

    Bulletin of the University of New Hampshire. Graduate catalog 1987-1989.

    Get PDF