Geodesics on the manifold of multivariate generalized Gaussian distributions with an application to multicomponent texture discrimination
We consider the Rao geodesic distance (GD) based on the Fisher information as a similarity measure on the manifold of zero-mean multivariate generalized Gaussian distributions (MGGD). The MGGD is shown to be an adequate model for the heavy-tailed wavelet statistics in multicomponent images, such as color or multispectral images. We discuss the estimation of MGGD parameters using various methods. We apply the GD between MGGDs to color texture discrimination in several classification experiments, taking into account the correlation structure between the spectral bands in the wavelet domain. We compare the performance, both in terms of texture discrimination capability and computational load, of the GD and the Kullback-Leibler divergence (KLD). In addition, both uni- and multivariate generalized Gaussian models, characterized by a fixed or a variable shape parameter, are evaluated. The modeling of the interband correlation significantly improves classification efficiency, while the GD is shown to consistently outperform the KLD as a similarity measure.
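For reference, a common parameterization of the zero-mean m-variate MGGD with scatter matrix Sigma and shape parameter beta (the paper's exact convention may differ slightly) is

p(\mathbf{x}\mid\boldsymbol{\Sigma},\beta) = \frac{\Gamma(m/2)\,\beta}{\pi^{m/2}\,\Gamma\!\left(\frac{m}{2\beta}\right)\,2^{m/(2\beta)}\,|\boldsymbol{\Sigma}|^{1/2}}\,\exp\!\left(-\frac{1}{2}\left(\mathbf{x}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{x}\right)^{\beta}\right),

where beta = 1 recovers the multivariate Gaussian and beta < 1 yields the heavier tails typical of wavelet statistics. In the Gaussian special case, the Rao GD between two zero-mean models has the closed form d(\Sigma_1,\Sigma_2) = \left(\frac{1}{2}\sum_i(\ln\lambda_i)^2\right)^{1/2}, with \lambda_i the eigenvalues of \Sigma_1^{-1}\Sigma_2; for general beta the GD is computed along geodesics of the Fisher metric.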
Learning to detect video events from zero or very few video examples
In this work we deal with the problem of high-level event detection in video.
Specifically, we study the challenging problems of i) learning to detect video
events from solely a textual description of the event, without using any
positive video examples, and ii) additionally exploiting very few positive
training samples together with a small number of ``related'' videos. For
learning only from an event's textual description, we first identify a general
learning framework and then study the impact of different design choices for
various stages of this framework. For additionally learning from example
videos, when true positive training samples are scarce, we employ an extension
of the Support Vector Machine that allows us to exploit ``related'' event
videos by automatically introducing different weights for subsets of the videos
in the overall training set. Experimental evaluations performed on the
large-scale TRECVID MED 2014 video dataset provide insight on the effectiveness
of the proposed methods.
Comment: Image and Vision Computing Journal, Elsevier, 2015, accepted for publication.
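As a rough illustration of the weighted-SVM idea (a sketch, not the paper's
exact formulation), per-subset sample weights can be passed to a standard SVM
so that ``related'' videos act as down-weighted positives. The feature
dimensions, subset sizes, and weight values below are assumptions made for
the example.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_pos = rng.normal(1.0, 1.0, size=(5, 64))    # very few true positive videos
X_rel = rng.normal(0.6, 1.0, size=(20, 64))   # "related" videos, weak positives
X_neg = rng.normal(0.0, 1.0, size=(200, 64))  # background negatives

X = np.vstack([X_pos, X_rel, X_neg])
y = np.concatenate([np.ones(5), np.ones(20), np.zeros(200)])

# Per-subset weights: related videos count less than true positives.
# The constants are illustrative; the paper's extension assigns weights to
# subsets of the training videos rather than hand-picking them.
w = np.concatenate([np.full(5, 1.0), np.full(20, 0.3), np.full(200, 1.0)])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y, sample_weight=w)             # weighted hinge loss per sample
print(clf.decision_function(X_neg[:3]))    # rank test videos by decision value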
Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization
The self-media era has produced a tremendous volume of high-quality videos.
Unfortunately, frequent video copyright infringements are now seriously
damaging the interests and enthusiasm of video creators. Identifying
infringing videos is therefore a
compelling task. Current state-of-the-art methods tend to simply feed
high-dimensional mixed video features into deep neural networks and count on
the networks to extract useful representations. Despite its simplicity, this
paradigm heavily relies on the original entangled features and lacks
constraints guaranteeing that useful task-relevant semantics are extracted from
the features.
In this paper, we seek to tackle the above challenges from two aspects: (1)
We propose to disentangle the original high-dimensional feature into multiple
exclusive, lower-dimensional sub-features. We expect the sub-features to
encode non-overlapping semantics of the original feature and to remove
redundant information.
(2) On top of the disentangled sub-features, we further learn an auxiliary
feature to enhance the sub-features. We theoretically analyze the mutual
information between the label and the disentangled features, arriving at a loss
that maximizes the extraction of task-relevant information from the original
feature.
Extensive experiments on two large-scale benchmark datasets (i.e., SVD and
VCSL) demonstrate that our method achieves 90.1% TOP-100 mAP on the large-scale
SVD dataset and also sets the new state-of-the-art on the VCSL benchmark
dataset. Our code and model have been released at
https://github.com/yyyooooo/DMI/, hoping to contribute to the community.
Comment: This paper is accepted by ACM MM 2023.
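As a minimal sketch of the disentangle-then-classify idea (the module names,
projection heads, decorrelation penalty, and the use of cross-entropy as a
tractable lower bound on label-feature mutual information are assumptions;
the paper derives its own MI-based loss and architecture):

import torch
import torch.nn as nn

class DisentangleHead(nn.Module):
    # Split one high-dimensional feature into k exclusive sub-features.
    def __init__(self, in_dim=1024, k=4, sub_dim=128, n_classes=2):
        super().__init__()
        self.projs = nn.ModuleList(nn.Linear(in_dim, sub_dim) for _ in range(k))
        self.cls = nn.Linear(k * sub_dim, n_classes)

    def forward(self, x):
        subs = [p(x) for p in self.projs]          # k sub-features
        return subs, self.cls(torch.cat(subs, dim=-1))

def redundancy_penalty(subs):
    # Push sub-features toward non-overlapping semantics by decorrelating
    # them pairwise (a common proxy, not the paper's exact constraint).
    loss = 0.0
    for i in range(len(subs)):
        for j in range(i + 1, len(subs)):
            a = subs[i] - subs[i].mean(0)
            b = subs[j] - subs[j].mean(0)
            loss = loss + (a.T @ b).pow(2).mean()
    return loss

model = DisentangleHead()
x = torch.randn(8, 1024)               # batch of mixed video features
y = torch.randint(0, 2, (8,))          # infringing / non-infringing labels
subs, logits = model(x)
# Cross-entropy maximizes a lower bound on I(label; features); the
# decorrelation term removes redundancy across the sub-features.
loss = nn.functional.cross_entropy(logits, y) + 1e-3 * redundancy_penalty(subs)
loss.backward()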
Accelerated Probabilistic Learning Concept for Mining Heterogeneous Earth Observation Images
We present an accelerated probabilistic learning concept and its prototype implementation for mining heterogeneous Earth observation images, e.g., multispectral images, synthetic aperture radar (SAR) images, image time series, or geographical information systems (GIS) maps. The system prototype combines, at pixel level, the unsupervised clustering results of different features, extracted from heterogeneous satellite images and geographical information resources, with user-defined semantic annotations in order to calculate the posterior probabilities that allow the final probabilistic searches. The system is able to learn different semantic labels based on a newly developed Bayesian networks algorithm and allows different probabilistic retrieval methods of all semantically related images with only a few user interactions. The new algorithm reduces the computational cost, outperforming existing conventional systems, under certain conditions, by several orders of magnitude. The achieved speed-up allows the introduction of new feature models improving the learning capabilities of knowledge-driven image information mining systems and opening them to Big Data environments.
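The core posterior computation can be sketched as a naive-Bayes-style fusion of per-feature cluster assignments with user annotations; this toy example (made-up cluster indices and labels, Laplace smoothing) only illustrates the Bayesian step, not the paper's accelerated algorithm.

import numpy as np

# Each pixel carries cluster indices from several feature spaces
# (e.g., spectral, texture, SAR-derived); values are illustrative.
clusters = np.array([[3, 1], [3, 2], [0, 1], [3, 1]])  # (n_pixels, n_features)
labels = np.array([1, 1, 0, 1])                        # semantic annotations

n_feat = clusters.shape[1]
n_classes = labels.max() + 1

def posterior(pix):
    # log P(label) + sum_f log P(cluster_f | label), assuming conditional
    # independence of the feature-wise clusterings given the label.
    log_p = np.log(np.bincount(labels, minlength=n_classes) + 1.0)
    for f in range(n_feat):
        n_bins = clusters[:, f].max() + 1
        for c in range(n_classes):
            mask = labels == c
            counts = np.bincount(clusters[mask, f], minlength=n_bins)
            log_p[c] += np.log((counts[pix[f]] + 1.0) / (mask.sum() + n_bins))
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

print(posterior(np.array([3, 1])))  # posterior over semantic labels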
Clustering-based analysis of semantic concept models for video shots
In this paper we present a clustering-based method for representing semantic concepts on multimodal low-level feature spaces and study the evaluation of the goodness of such models with entropy-based methods. As different semantic concepts in video are most accurately represented with different features and modalities, we utilize the relative model-wise confidence values of the feature extraction techniques in weighting them automatically. The method also provides a natural way of measuring the similarity of different concepts in a multimedia lexicon. The experiments of the paper are conducted using the development set of the TRECVID 2005 corpus together with a common annotation for 39 semantic concepts.
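One way to read the entropy-based evaluation: the lower the average entropy of the concept distribution within clusters, the better the feature space separates that concept. A small sketch with made-up per-cluster statistics:

import numpy as np

p = np.array([0.9, 0.1, 0.5, 0.8])     # P(concept | cluster), illustrative
counts = np.array([120, 300, 80, 50])  # shots per cluster

def binary_entropy(q):
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -(q * np.log2(q) + (1 - q) * np.log2(1 - q))

# Weighted average entropy over clusters: lower = better concept model.
print(np.average(binary_entropy(p), weights=counts))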
Graph Regularized Non-negative Matrix Factorization By Maximizing Correntropy
Non-negative matrix factorization (NMF) has proved effective in many
clustering and classification tasks. The classic ways to measure the errors
between the original and the reconstructed matrix are Euclidean distance or
Kullback-Leibler (KL) divergence. However, nonlinear cases are not properly
handled when we use these error measures. As a consequence, alternative
measures based on nonlinear kernels, such as correntropy, are proposed.
However, the current correntropy-based NMF only targets the low-level
features without considering the intrinsic geometrical distribution of the data. In
this paper, we propose a new NMF algorithm that preserves local invariance by
adding graph regularization into the process of max-correntropy-based matrix
factorization. Meanwhile, each feature can learn corresponding kernel from the
data. Experimental results on Caltech101 and Caltech256 show the benefits of
such a combination against other NMF algorithms for unsupervised image
clustering.
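For orientation, a minimal sketch of graph-regularized NMF with multiplicative updates is given below; for brevity it uses the standard Frobenius objective ||X - UV^T||^2 + lam * tr(V^T L V) rather than the correntropy measure, whose maximization additionally reweights the reconstruction term (e.g., via half-quadratic optimization). Matrix sizes and the toy affinity graph are assumptions.

import numpy as np

def gnmf(X, W, k=10, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    # X: (m, n) non-negative data, columns are samples.
    # W: (n, n) non-negative affinity graph over samples; L = D - W.
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    D = np.diag(W.sum(axis=1))
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

X = np.abs(np.random.default_rng(1).normal(size=(50, 100)))  # toy data
W = (np.corrcoef(X.T) > 0.5).astype(float)                   # toy affinity
np.fill_diagonal(W, 0.0)
U, V = gnmf(X, W, k=5)
print(np.linalg.norm(X - U @ V.T))  # reconstruction error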
Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance
We present a statistical view of the texture retrieval problem by combining the two related tasks, namely feature extraction (FE) and similarity measurement (SM), into a joint modeling and classification scheme. We show that using a consistent estimator of texture model parameters for the FE step, followed by computing the Kullback-Leibler distance (KLD) between estimated models for the SM step, is asymptotically optimal in terms of retrieval error probability. The statistical scheme leads to a new wavelet-based texture retrieval method that is based on the accurate modeling of the marginal distribution of wavelet coefficients using the generalized Gaussian density (GGD) and on the existence of a closed form for the KLD between GGDs. The proposed method provides greater accuracy and flexibility in capturing texture information, while its simplified form has a close resemblance to existing methods that use energy distribution in the frequency domain to identify textures. Experimental results on a database of 640 texture images indicate that the new method significantly improves retrieval rates, e.g., from 65% to 77%, compared with traditional approaches, while retaining comparable levels of computational complexity.
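Concretely, with the GGD written as p(x;\alpha,\beta) = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}\,e^{-(|x|/\alpha)^{\beta}}, the closed-form KLD between two GGDs is

D\big(p(\cdot;\alpha_1,\beta_1)\,\|\,p(\cdot;\alpha_2,\beta_2)\big) = \ln\!\left(\frac{\beta_1\,\alpha_2\,\Gamma(1/\beta_2)}{\beta_2\,\alpha_1\,\Gamma(1/\beta_1)}\right) + \left(\frac{\alpha_1}{\alpha_2}\right)^{\beta_2}\frac{\Gamma\!\left((\beta_2+1)/\beta_1\right)}{\Gamma(1/\beta_1)} - \frac{1}{\beta_1},

and, under the scheme's independence assumption across wavelet subbands, the overall similarity between two textures is obtained by summing these per-subband KLDs.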
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making these hypotheses and
constraints explicit makes the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization
is that it allows categorizing the newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables.