3,754 research outputs found
Latent Semantic Learning with Structured Sparse Representation for Human Action Recognition
This paper proposes a novel latent semantic learning method for extracting
high-level features (i.e. latent semantics) from a large vocabulary of abundant
mid-level features (i.e. visual keywords) with structured sparse
representation, which can help to bridge the semantic gap in the challenging
task of human action recognition. To discover the manifold structure of
midlevel features, we develop a spectral embedding approach to latent semantic
learning based on L1-graph, without the need to tune any parameter for graph
construction as a key step of manifold learning. More importantly, we construct
the L1-graph with structured sparse representation, which can be obtained by
structured sparse coding with its structured sparsity ensured by novel L1-norm
hypergraph regularization over mid-level features. In the new embedding space,
we learn latent semantics automatically from abundant mid-level features
through spectral clustering. The learnt latent semantics can be readily used
for human action recognition with SVM by defining a histogram intersection
kernel. Different from the traditional latent semantic analysis based on topic
models, our latent semantic learning method can explore the manifold structure
of mid-level features in both L1-graph construction and spectral embedding,
which results in compact but discriminative high-level features. The
experimental results on the commonly used KTH action dataset and unconstrained
YouTube action dataset show the superior performance of our method.Comment: The short version of this paper appears in ICCV 201
Feedback-prop: Convolutional Neural Network Inference under Partial Evidence
We propose an inference procedure for deep convolutional neural networks
(CNNs) when partial evidence is available. Our method consists of a general
feedback-based propagation approach (feedback-prop) that boosts the prediction
accuracy for an arbitrary set of unknown target labels when the values for a
non-overlapping arbitrary set of target labels are known. We show that existing
models trained in a multi-label or multi-task setting can readily take
advantage of feedback-prop without any retraining or fine-tuning. Our
feedback-prop inference procedure is general, simple, reliable, and works on
different challenging visual recognition tasks. We present two variants of
feedback-prop based on layer-wise and residual iterative updates. We experiment
using several multi-task models and show that feedback-prop is effective in all
of them. Our results unveil a previously unreported but interesting dynamic
property of deep CNNs. We also present an associated technical approach that
takes advantage of this property for inference under partial evidence in
general visual recognition tasks.Comment: Accepted to CVPR 201
Sharing Video Emotional Information in the Web
Video growth over the Internet changed the way users search, browse and view video content. Watching
movies over the Internet is increasing and becoming a pastime. The possibility of streaming Internet content
to TV, advances in video compression techniques and video streaming have turned this recent modality of
watching movies easy and doable. Web portals as a worldwide mean of multimedia data access need to have
their contents properly classified in order to meet users’ needs and expectations. The authors propose a set of semantic descriptors based on both user physiological signals, captured while watching videos, and on video low-level features extraction. These XML based descriptors contribute to the creation of automatic affective meta-information that will not only enhance a web-based video recommendation system based in emotional information, but also enhance search and retrieval of videos affective content from both users’ personal classifications and content classifications in the context of a web portal.info:eu-repo/semantics/publishedVersio
New horizons in the study of child language acquisition
URL to paper on conference site.Naturalistic longitudinal recordings of child development promise to reveal fresh perspectives on fundamental questions of language acquisition. In a pilot effort, we have recorded 230,000 hours of audio-video recordings spanning the first three years of one child's life at home. To study a corpus of this scale and richness, current methods of developmental cognitive science are inadequate. We are developing new methods for data analysis and interpretation that combine pattern recognition algorithms with interactive user interfaces and data visualization. Preliminary speech analysis reveals surprising levels of linguistic fine-tuning by caregivers that may provide crucial support for word learning. Ongoing analyses of the corpus aim to model detailed aspects of the child's language development as a function of learning mechanisms combined with lifetime experience. Plans to collect similar corpora from more children based on a transportable recording system are underway.National Science Foundation (U.S.)MIT Center for Future BankingMassachusetts Institute of Technology. Media LaboratoryUnited States. Office of Naval ResearchUnited States. Dept. of Defens
Cognitive Representation Learning of Self-Media Online Article Quality
The automatic quality assessment of self-media online articles is an urgent
and new issue, which is of great value to the online recommendation and search.
Different from traditional and well-formed articles, self-media online articles
are mainly created by users, which have the appearance characteristics of
different text levels and multi-modal hybrid editing, along with the potential
characteristics of diverse content, different styles, large semantic spans and
good interactive experience requirements. To solve these challenges, we
establish a joint model CoQAN in combination with the layout organization,
writing characteristics and text semantics, designing different representation
learning subnetworks, especially for the feature learning process and
interactive reading habits on mobile terminals. It is more consistent with the
cognitive style of expressing an expert's evaluation of articles. We have also
constructed a large scale real-world assessment dataset. Extensive experimental
results show that the proposed framework significantly outperforms
state-of-the-art methods, and effectively learns and integrates different
factors of the online article quality assessment.Comment: Accepted at the Proceedings of the 28th ACM International Conference
on Multimedi
- …