13,024 research outputs found

Latent Semantic Learning with Structured Sparse Representation for Human Action Recognition

Author: Zhiwu Lu
Yuxin Peng
Publication venue: Elsevier BV
Publication date: 22/09/2011

This paper proposes a novel latent semantic learning method for extracting high-level features (i.e. latent semantics) from a large vocabulary of abundant mid-level features (i.e. visual keywords) with structured sparse representation, which can help to bridge the semantic gap in the challenging task of human action recognition. To discover the manifold structure of mid-level features, we develop a spectral embedding approach to latent semantic learning based on L1-graph, without the need to tune any parameter for graph construction as a key step of manifold learning. More importantly, we construct the L1-graph with structured sparse representation, which can be obtained by structured sparse coding with its structured sparsity ensured by novel L1-norm hypergraph regularization over mid-level features. In the new embedding space, we learn latent semantics automatically from abundant mid-level features through spectral clustering. The learnt latent semantics can be readily used for human action recognition with SVM by defining a histogram intersection kernel. Unlike traditional latent semantic analysis based on topic models, our latent semantic learning method can explore the manifold structure of mid-level features in both L1-graph construction and spectral embedding, which results in compact but discriminative high-level features. The experimental results on the commonly used KTH action dataset and the unconstrained YouTube action dataset show the superior performance of our method. Comment: The short version of this paper appears in ICCV 201
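The histogram intersection kernel mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each video is already represented as a normalized histogram over latent semantics; the function names and the example histograms are hypothetical and are not taken from the paper.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms):
    """Pairwise kernel values, in the form a precomputed-kernel SVM expects."""
    return [[histogram_intersection(a, b) for b in histograms]
            for a in histograms]


# Hypothetical normalized histograms over four latent semantics.
video_a = [0.5, 0.25, 0.25, 0.0]
video_b = [0.25, 0.25, 0.0, 0.5]
K = gram_matrix([video_a, video_b])
```

A matrix such as `K` could then be handed to any SVM implementation that accepts a precomputed kernel; which implementation the authors used is not stated in the abstract.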

arXiv.org e-Print Archive

Crossref

Image classification by visual bag-of-words refinement and reduction

Author: Lu Zhiwu
Wang Liwei
Wen Ji-Rong
Publication venue: Elsevier BV
Publication date: 18/01/2015

This paper presents a new framework for visual bag-of-words (BOW) refinement and reduction to overcome the drawbacks of the visual BOW model, which has been widely used for image classification. Although very influential in the literature, the traditional visual BOW model has two distinct drawbacks. Firstly, for efficiency purposes, the visual vocabulary is commonly constructed by directly clustering the low-level visual feature vectors extracted from local keypoints, without considering the high-level semantics of images. That is, the visual BOW model still suffers from the semantic gap, and thus may lead to significant performance degradation in more challenging tasks (e.g. social image classification). Secondly, typically thousands of visual words are generated to obtain better performance on a relatively large image dataset. With such a large vocabulary, the subsequent image classification can take a considerable amount of time. To overcome the first drawback, we develop a graph-based method for visual BOW refinement by exploiting the tags (easy to access although noisy) of social images. More notably, for efficient image classification, we further reduce the refined visual BOW model to a much smaller size through semantic spectral clustering. Extensive experimental results show the promising performance of the proposed framework for visual BOW refinement and reduction.
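For orientation, the traditional visual BOW representation that the abstract above sets out to improve can be sketched as follows. This is only the baseline model, not the proposed refinement or reduction, and it assumes a visual vocabulary has already been obtained by clustering. The vocabulary, descriptors, and function names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def bow_histogram(descriptors, vocabulary):
    """Assign each local descriptor to its nearest visual word and
    return the normalized word-count histogram for the image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda k: squared_distance(d, vocabulary[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Hypothetical 2-D vocabulary (cluster centres) and keypoint descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.0, 5.5)]
histogram = bow_histogram(descriptors, vocabulary)
```

The histogram length equals the vocabulary size, which is why a vocabulary of thousands of words makes the downstream classifier expensive.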

arXiv.org e-Print Archive

Autoencoding beyond pixels using a learned similarity metric

Author: Larochelle Hugo
Larsen Anders Boesen Lindbo
Sønderby Søren Kaae
Winther Ole
Publication date: 01/01/2016

We present an autoencoder that leverages learned representations to better measure similarities in data space. By combining a variational autoencoder with a generative adversarial network we can use learned feature representations in the GAN discriminator as the basis for the VAE reconstruction objective. Thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance towards e.g. translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity. Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.
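The contrast between element-wise and feature-wise errors can be illustrated with a toy sketch. In the abstract the features come from a learned GAN discriminator; here a hand-written, shift-invariant feature function (`toy_features`, a hypothetical stand-in) is used instead, purely to show why a feature-space error can tolerate translation.

```python
def elementwise_error(x, y):
    """Sum of squared differences, compared position by position."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def toy_features(signal):
    """Hypothetical stand-in for learned features: the peak value and
    the total mass, neither of which changes when the signal shifts."""
    return [max(signal), sum(signal)]


def featurewise_error(x, y, features=toy_features):
    """Squared error measured in feature space instead of data space."""
    return elementwise_error(features(x), features(y))


# A 1-D "image" and the same pattern shifted one position to the right.
original = [0, 0, 1, 1, 0, 0]
shifted = [0, 0, 0, 1, 1, 0]
pixel_err = elementwise_error(original, shifted)
feature_err = featurewise_error(original, shifted)
```

Here the element-wise error penalizes the shift while the feature-wise error does not, which is the kind of invariance the abstract refers to.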

arXiv.org e-Print Archive

Copenhagen University Research Information System

Online Research Database In Technology

Hierarchical video summarisation in reference frame subspace

Author: Crookes D
Jiang RM
Sadka AH
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

In this paper, a hierarchical video structure summarization approach using Laplacian Eigenmap is proposed, where a small set of reference frames is selected from the video sequence to form a reference subspace in which the dissimilarity between two arbitrary frames is measured. In the proposed summarization scheme, the shot-level key frames are first detected from the continuity of inter-frame dissimilarity, and the sub-shot-level and scene-level representative frames are then summarized using k-means clustering. Experiments are carried out on both test videos and movies, and the results show that, in comparison with a similar approach using latent semantic analysis, the proposed approach using Laplacian Eigenmap achieves a better recall rate in key frame detection and gives an efficient hierarchical summarization at the sub-shot, shot and scene levels.
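The idea of detecting shot-level key frames from breaks in the continuity of inter-frame dissimilarity can be sketched as below. The sketch assumes each frame has already been mapped to a low-dimensional vector (in the paper, via the reference-frame subspace and Laplacian Eigenmap, which is not reproduced here). The fixed threshold, the example vectors, and the function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two frame embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def shot_boundaries(frames, threshold):
    """Return the indices of frames that start a new shot, i.e. where the
    dissimilarity to the previous frame exceeds the threshold."""
    return [i for i in range(1, len(frames))
            if euclidean(frames[i - 1], frames[i]) > threshold]


# Hypothetical 2-D frame embeddings: two smooth runs separated by a jump.
frames = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.1), (3.0, 3.0), (3.1, 3.0)]
boundaries = shot_boundaries(frames, threshold=1.0)
```

A fixed threshold is the simplest possible choice; the abstract does not say how discontinuities are actually detected.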

Brunel University Research Archive

On staying grounded and avoiding Quixotic dead ends

Author: Lawrence W. Barsalou
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

The 15 articles in this special issue on The Representation of Concepts illustrate the rich variety of theoretical positions and supporting research that characterize the area. Although much agreement exists among contributors, much disagreement exists as well, especially about the roles of grounding and abstraction in conceptual processing. I first review theoretical approaches raised in these articles that I believe are Quixotic dead ends, namely, approaches that are principled and inspired but likely to fail. In the process, I review various theories of amodal symbols, their distortions of grounded theories, and fallacies in the evidence used to support them. Incorporating further contributions across articles, I then sketch a theoretical approach that I believe is likely to be successful, which includes grounding, abstraction, flexibility, explaining classic conceptual phenomena, and making contact with real-world situations. This account further proposes that (1) a key element of grounding is neural reuse, (2) abstraction takes the forms of multimodal compression, distilled abstraction, and distributed linguistic representation (but not amodal symbols), and (3) flexible context-dependent representations are a hallmark of conceptual processing.

Crossref

Springer - Publisher Connector

PubMed Central

Enlighten