21 research outputs found

Globally maximizing, locally minimizing : unsupervised discriminant projection with applications to face and palm biometrics

Author: Niu B
Yang J
Yang JY
Zhang DD
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/12/2014

2006-2007 > Academic research: refereed > Publication in refereed journal; Version of Record; Publishe

PolyU Institutional Repository

WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminant Analysis

Author: Chen Yiye
Lin Yunzhi
Vela Patricio A.
Xu Ruinian
Publication date: 29/08/2023

Deep neural networks are susceptible to generating overconfident yet erroneous predictions when presented with data beyond known concepts. This challenge underscores the importance of detecting out-of-distribution (OOD) samples in the open world. In this work, we propose a novel feature-space OOD detection score based on class-specific and class-agnostic information. Specifically, the approach utilizes Whitened Linear Discriminant Analysis to project features into two subspaces - the discriminative and residual subspaces - for which the in-distribution (ID) classes are maximally separated and closely clustered, respectively. The OOD score is then determined by combining the deviation from the input data to the ID pattern in both subspaces. The efficacy of our method, named WDiscOOD, is verified on the large-scale ImageNet-1k benchmark, with six OOD datasets that cover a variety of distribution shifts. WDiscOOD demonstrates superior performance on deep classifiers with diverse backbone architectures, including CNN and vision transformer. Furthermore, we also show that WDiscOOD more effectively detects novel concepts in representation spaces trained with contrastive objectives, including supervised contrastive loss and multi-modality contrastive loss. Comment: Accepted by ICCV 2023. Code is available at: https://github.com/ivalab/WDiscOOD.gi
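
The two-subspace projection described in this abstract can be sketched with textbook whitened LDA. This is only an illustration under assumed notation: the scatter matrices $S_w$ and $S_b$, the class means $\mu_c$, the global mean $\mu$, the class sizes $n_c$, and the split size $K$ are symbols introduced here, and the paper's exact construction and scoring rule may differ.

```latex
S_w = \sum_{c} \sum_{i \in c} (x_i - \mu_c)(x_i - \mu_c)^{\top},
\qquad
S_b = \sum_{c} n_c \, (\mu_c - \mu)(\mu_c - \mu)^{\top}

% Whitening makes the within-class scatter the identity:
z = S_w^{-1/2} x,
\qquad
\tilde{S}_b = S_w^{-1/2} \, S_b \, S_w^{-1/2} = V \Lambda V^{\top}

% Split the eigenvectors (sorted by decreasing eigenvalue) into two blocks:
V = [\, V_D \;\; V_R \,],
\qquad
z_D = V_D^{\top} z \ \text{(discriminative part)},
\qquad
z_R = V_R^{\top} z \ \text{(residual part)}
```

Here $V_D$ holds the first $K$ eigenvectors, along which the class means are most spread out, and $V_R$ holds the remaining ones, along which the in-distribution classes sit close together.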

arXiv.org e-Print Archive

Audio-visual football video analysis, from structure detection to attention analysis

Author: Ren Reede
Publication date: 01/01/2008

Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academic fields. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop specific techniques for content-based sports video analysis to utilise these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are key stories for sports videos; (2) what attracts viewers’ interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approached these questions from two different perspectives and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important contents in sports videos. Detecting replay segments is an efficient way to collect game highlights. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of replay is complex, and includes logo transitions, slow motions, viewpoint switches and normal speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement of replay detection. A two-pass system was developed, including a five-layer adaboost classifier and logo template matching across an entire video. The five-layer adaboost utilises shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions, to filter out logo transition candidates. Subsequently, a logo template is constructed and employed to find all transition logo sequences.
The precision and recall of this system in replay detection are 100% in a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as other sports videos. We review the literature of content-based temporal structures, such as play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break, were identified by low level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as a kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory signals, and multiple modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure during watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. This framework tries to estimate a noise-less signal from these coarse noisy observations by a multiple resolution analysis.
Related algorithms are developed, such as event segmentation on a MAR tree and real time event detection. The experiment shows that these attention-based approaches can find goal events with high precision. Moreover, results of MAR-based highlight detection on the final game of FIFA 2002 and 2006 are highly similar to professionally labelled highlights by BBC and FIFA
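
The step in this abstract that finds the longest repetitive substring in a shot-class label sequence can be illustrated with a short sketch. It is not the thesis's implementation: it swaps the suffix tree for a naive sorted-suffix scan, which is simpler but slower. The one-letter codes (P = play, F = focus, R = replay, B = break) and the function name are assumptions made for this example.

```python
def longest_repeated_run(labels):
    """Return the longest contiguous run of labels that occurs at least twice.

    Naive sorted-suffix approach (not a suffix tree). Occurrences are
    allowed to overlap. Works on strings or lists of labels.
    """
    n = len(labels)
    # Sort suffix start positions by the suffix they point at.
    order = sorted(range(n), key=lambda i: labels[i:])
    best_start, best_len = 0, 0
    # The longest repeat is a common prefix of two neighbouring sorted suffixes.
    for a, b in zip(order, order[1:]):
        k = 0
        while a + k < n and b + k < n and labels[a + k] == labels[b + k]:
            k += 1
        if k > best_len:
            best_start, best_len = a, k
    return labels[best_start:best_start + best_len]


# Hypothetical shot-class sequence: P = play, F = focus, R = replay, B = break.
shots = "PFRBPPFRBPB"
print(longest_repeated_run(shots))
```

For long videos a real suffix tree or suffix array construction would be preferable, since sorting whole suffixes by slicing costs far more than linear time.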

Glasgow Theses Service

CiteSeerX

OpenGrey Repository

Person Re-identification in Identity Regression Space

Author: Gong S
Wang H
Xiang T
Zhu X
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/06/2018

This work was partially supported by the China Scholarship Council, Vision Semantics Ltd, Royal Society Newton Advanced Fellowship Programme (NA150459), and Innovate UK Industrial Challenge Project on Developing and Commercialising Intelligent Video Analytics Solutions for Public Safety (98111-571149)

arXiv.org e-Print Archive

Queen Mary Research Online

Simultaneous model-based clustering and visualization in the Fisher discriminative subspace

Author: Camille Brunet
Charles Bouveyron
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/04/2011

Clustering in high-dimensional spaces is a recurrent problem in many scientific domains but remains a difficult task, in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than the dimension of the original space. By constraining model parameters within and between groups, a family of 12 parsimonious DLM models is obtained, which can fit a variety of situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets show that the proposed approach performs better than existing clustering methods while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data

arXiv.org e-Print Archive

HAL Evry

Crossref

HAL-Paris1

A survey of face detection, extraction and recognition

Author: Lu Yongzhong
Yu Shengsheng
Zhou Jingli
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 20/02/2012

The goal of this paper is to present a critical survey of the existing literature on human face recognition over the last 4-5 years. Interest and research activities in face recognition have increased significantly over the past few years, especially after the American airliner tragedy on September 11, 2001. While this growth is largely driven by growing application demands, ranging from static matching of controlled photographs, as in mug shot matching and credit card verification, to surveillance video images, identification for law enforcement, and authentication for banking and security system access, advances in signal analysis techniques, such as wavelets and neural networks, are also important catalysts. As the number of proposed techniques increases, survey and evaluation become important

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant Analysis

Author: Bai Zhaojun
Roh Dong Min
Publication date: 21/11/2022

Much like the classical Fisher linear discriminant analysis, Wasserstein discriminant analysis (WDA) is a supervised linear dimensionality reduction method that seeks a projection matrix to maximize the dispersion between different data classes and minimize the dispersion within the same data class. However, in contrast, WDA can account for both global and local inter-connections between data classes using a regularized Wasserstein distance. WDA is formulated as a bi-level nonlinear trace ratio optimization. In this paper, we present a bi-level nonlinear eigenvector (NEPv) algorithm, called WDA-nepv. The inner kernel of WDA-nepv for computing the optimal transport matrix of the regularized Wasserstein distance is formulated as an NEPv, and the outer kernel for the trace ratio optimization is formulated as another NEPv. Consequently, both kernels can be computed efficiently via self-consistent-field iterations and modern solvers for linear eigenvalue problems. Compared with the existing algorithms for WDA, WDA-nepv is derivative-free and surrogate-model-free. The computational efficiency of WDA-nepv and its classification accuracy in applications are demonstrated using synthetic and real-life datasets
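
For orientation, the trace ratio form mentioned in this abstract is usually written along the following lines. The notation is assumed here rather than taken from this listing: $P$ is a projection matrix with $p$ orthonormal columns, and $C_b(P)$ and $C_w(P)$ stand for between-class and within-class cross-covariance matrices.

```latex
\max_{P \in \mathbb{R}^{d \times p},\; P^{\top} P = I_p}
\frac{\operatorname{tr}\!\left( P^{\top} C_b(P) \, P \right)}
     {\operatorname{tr}\!\left( P^{\top} C_w(P) \, P \right)}
```

The problem is bi-level because $C_b(P)$ and $C_w(P)$ themselves depend on $P$: they are weighted by optimal transport matrices computed from the regularized Wasserstein distance between the projected classes, which is the inner problem the abstract refers to.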

arXiv.org e-Print Archive