High-level feature detection from video in TRECVid: a 5-year retrospective of achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, on matching one video frame against others using low-level characteristics like colour, texture or shape, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically, however, this depends on being able to determine whether each feature is or is not present in a video clip.
The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement of where the state of the art stands on this important task, not just for one research group or one approach, but across the spectrum. We then use this past and ongoing work as a basis for highlighting the trends emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection in video.
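The navigation idea described above can be loosely sketched as follows: once each shot has a confidence score per semantic feature, a query is answered by ranking shots on the relevant scores. The feature names, shot identifiers and score values below are invented for illustration; a real TRECVid-style system would obtain scores from trained detectors.

```python
# Sketch: ranking video shots by semantic-feature confidence scores.
# All feature names and scores are invented for illustration only.

def rank_shots(shot_scores, query_features, top_k=3):
    """Rank shots by the mean confidence of the queried features."""
    ranked = sorted(
        shot_scores.items(),
        key=lambda kv: -sum(kv[1].get(f, 0.0) for f in query_features)
        / len(query_features),
    )
    return [shot for shot, _ in ranked[:top_k]]

# Hypothetical per-shot detector outputs.
shots = {
    "shot_01": {"outdoor": 0.9, "person": 0.8, "car": 0.1},
    "shot_02": {"outdoor": 0.2, "person": 0.9, "car": 0.7},
    "shot_03": {"outdoor": 0.7, "person": 0.1, "car": 0.9},
}

print(rank_shots(shots, ["person", "outdoor"]))  # shots of people outdoors rank first
```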
Action Recognition in Tennis Videos using Optical Flow and Conditional Random Fields
The aim of Action Recognition is the automated analysis and interpretation of events in video sequences. As a result of the applications that can be developed, and the widespread availability and popularization of digital video (security cameras, monitoring, social networks, among many others), this area is currently the focus of strong and wide research interest in various domains such as video security, human-computer interaction, patient monitoring and video retrieval, among others.
Our long-term goal is to develop automatic action identification in video sequences using Conditional Random Fields (CRFs). In this work we focus, as a case study, on the identification of a limited set of tennis shots during tennis matches. Three challenges have been addressed: player tracking, representation of player movements, and action recognition.
Video processing techniques are used to generate textual tags in specific frames, and the CRFs are then used as a classifier to recognise the actions performed in those frames. The preliminary results appear to be quite promising.
Sociedad Argentina de Informática e Investigación Operativa
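As a rough illustration of the classification step (not the authors' actual implementation), a linear-chain CRF scores a label sequence by combining per-frame potentials with transition potentials, and Viterbi decoding recovers the best sequence. The labels and potential values below are invented toy numbers.

```python
import numpy as np

# Minimal Viterbi decoding for a linear-chain CRF over per-frame tags.
# Labels and potentials are invented for illustration only.

def viterbi(emissions, transitions):
    """emissions: (T, L) per-frame label scores; transitions: (L, L) scores.
    Returns the highest-scoring label sequence as a list of indices."""
    T, L = emissions.shape
    score = emissions[0].copy()          # best score ending in each label
    back = np.zeros((T, L), dtype=int)   # backpointers
    for t in range(1, T):
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

labels = ["serve", "forehand", "backhand"]   # hypothetical tennis-shot labels
emissions = np.array([[2.0, 0.1, 0.1],
                      [0.2, 1.5, 1.4],
                      [0.1, 0.3, 1.8]])
transitions = np.array([[0.0, 0.5, 0.5],
                        [0.1, 0.3, 0.1],
                        [0.1, 0.1, 0.3]])
print([labels[i] for i in viterbi(emissions, transitions)])
```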
Unconstrained Face Detection and Open-Set Face Recognition Challenge
Face detection and recognition benchmarks have shifted toward more difficult environments. The challenge presented in this paper addresses the next step in the direction of automatic detection and identification of people from outdoor surveillance cameras. While face detection has shown remarkable success on images collected from the web, surveillance cameras involve more diverse occlusions, poses, weather conditions and image blur. Although face verification and closed-set face identification have surpassed human capabilities on some datasets, open-set identification is much more complex, as it needs to reject both unknown identities and false accepts from the face detector. We show that unconstrained face detection can reach high detection rates, albeit with moderate false accept rates. By contrast, open-set face recognition is currently weak and requires much more attention.
Comment: This is an ERRATA version of the paper originally presented at the International Joint Conference on Biometrics. Due to a bug in our evaluation code, the results of the participants changed; the final conclusion, however, is still the same.
Sparsity in Dynamics of Spontaneous Subtle Emotions: Analysis & Application
Spontaneous subtle emotions are expressed through micro-expressions: tiny, sudden and short-lived dynamics of facial muscles that pose a great challenge for visual recognition. The abrupt but significant dynamics relevant to the recognition task are temporally sparse, while the remaining, irrelevant dynamics are temporally redundant. In this work, we analyze and enforce sparsity constraints to learn significant temporal and spectral structures while eliminating irrelevant facial dynamics of micro-expressions, which eases the challenge of visually recognising spontaneous subtle emotions. The hypothesis is confirmed through experimental results of automatic spontaneous subtle emotion recognition at several sparsity levels on CASME II and SMIC, the only two publicly available spontaneous subtle emotion databases. The overall performance of automatic subtle emotion recognition is boosted when only the significant dynamics are preserved from the original sequences.
Comment: IEEE Transactions on Affective Computing (2016)
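A loose illustration of the temporal-sparsity idea (a simplification, not the paper's method): score each frame by its motion energy, here the mean absolute difference from the previous frame, and keep only the sparse set of frames whose dynamics exceed a threshold. The toy "video" and threshold are invented.

```python
import numpy as np

# Toy sketch of temporal sparsity: retain only frames whose motion energy
# is significant, discarding temporally redundant ones. The threshold and
# synthetic video are assumptions for illustration.

def significant_frames(video, threshold):
    """video: (T, H, W) grayscale frames. Returns indices of significant frames."""
    energy = np.abs(np.diff(video, axis=0)).mean(axis=(1, 2))  # length T-1
    return [t + 1 for t, e in enumerate(energy) if e > threshold]

video = np.zeros((6, 4, 4))
video[3] = 1.0   # a sudden, short-lived "micro-expression" spanning
video[4] = 1.0   # frames 3-4; onset and offset are the significant dynamics
print(significant_frames(video, threshold=0.5))
```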
ViSOR: VIdeo Surveillance On-line Repository for annotation retrieval
The aim of the ViSOR project [1] is to gather and make freely available a repository of surveillance videos and footage for the research community working on pattern recognition and multimedia retrieval.