Video Registration in Egocentric Vision under Day and Night Illumination Changes
With the spread of wearable devices and head-mounted cameras, a wide range of
applications requiring precise user localization is now possible. In this paper
we propose to treat the problem of obtaining the user position with respect to
a known environment as a video registration problem. Video registration, i.e.
the task of aligning an input video sequence to a pre-built 3D model, relies on
a matching process of local keypoints extracted on the query sequence to a 3D
point cloud. The overall registration performance is strictly tied to the
actual quality of this 2D-3D matching, and can degrade if environmental
conditions such as steep changes in lighting like the ones between day and
night occur. To effectively register an egocentric video sequence under these
conditions, we propose to tackle the source of the problem: the matching
process. To overcome the shortcomings of standard matching techniques, we
introduce a novel embedding space that allows us to obtain robust matches by
jointly taking into account local descriptors, their spatial arrangement and
their temporal robustness. The proposal is evaluated using unconstrained
egocentric video sequences both in terms of matching quality and resulting
registration performance using different 3D models of historical landmarks. The
results show that the proposed method can outperform state-of-the-art
registration algorithms, in particular when dealing with the challenges of
night and day sequences.
Large scale evaluations of multimedia information retrieval: the TRECVid experience
Information Retrieval is a supporting technique which underpins a broad range of content-based applications including retrieval, filtering, summarisation, browsing, classification, clustering, automatic linking, and others. Multimedia information retrieval (MMIR) represents those applications when applied to multimedia information such as image, video, music, etc. In this presentation and extended abstract we are primarily concerned with MMIR as applied to information in digital video format. We begin with a brief overview of large scale evaluations of IR tasks in areas such as text, image and music, to illustrate that this phenomenon is not restricted to MMIR on video. The main contribution, however, is a set of pointers and a summarisation of the work done as part of TRECVid, the annual benchmarking exercise for video retrieval tasks.
An automatic visual analysis system for tennis
This article presents a novel video analysis system for coaching tennis players of all levels, which uses computer vision algorithms to automatically edit and index tennis videos into meaningful annotations.
Existing tennis coaching software lacks the ability to automatically index a tennis match into key events, and therefore, a coach who uses existing software is burdened with time-consuming manual video editing. This work aims to explore the effectiveness of a system to automatically detect tennis events. A secondary aim of this work is to explore the benefits coaches experience in using an event retrieval system to retrieve the automatically indexed events. It was found that automatic event detection can significantly improve the experience of using video feedback as part of an instructional coaching session. In addition to the automatic detection of key tennis events, player and ball movements are automatically tracked throughout an entire match and this wealth of data allows users to find interesting patterns in play. Player and ball movement information is integrated with the automatically detected tennis events, and coaches can query the data to retrieve relevant key points during a match or analyse player patterns that need attention. This coaching software system allows coaches to build advanced queries, which cannot be facilitated with existing video coaching solutions, without tedious manual indexing. This article proves that the event detection algorithms in this work can detect the main events in tennis with an average precision and recall of 0.84 and 0.86, respectively, and can typically eliminate manual indexing of key tennis events.
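For readers unfamiliar with the metrics, the precision and recall figures quoted in the abstract above can be read as in the following minimal sketch. It assumes that detected and annotated events can be compared as sets of identifiers; the function name `precision_recall` and the event labels are hypothetical and are not taken from the article, which may match events differently (for example, by temporal overlap).

```python
def precision_recall(detected, ground_truth):
    """Return (precision, recall) for a set of detected events.

    Assumption: an event counts as correctly detected when its identifier
    appears in both sets. The article's own matching rule is not stated
    in the abstract.
    """
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)
    # Precision: fraction of detected events that are real events.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: fraction of real events that were detected.
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical event identifiers, for illustration only.
detected = {"serve_1", "rally_1", "serve_2", "net_1"}
ground_truth = {"serve_1", "rally_1", "serve_2", "rally_2", "serve_3"}

precision, recall = precision_recall(detected, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four detections are correct and three of the five annotated events are found.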
Toward next generation coaching tools for court based racquet sports
Even with today’s advances in automatic indexing of multimedia content, existing coaching tools for court sports lack the ability to automatically index a competitive match into key events. This paper proposes an automatic event indexing and event retrieval system for tennis, which can be used to coach from beginners upwards.
Event indexing is possible using either visual or inertial sensing, with the latter potentially providing system portability. To achieve maximum performance in event indexing, multi-sensor data integration is implemented, where data from both sensors is merged to automatically index key tennis events. A complete event retrieval system is also presented to allow coaches to build advanced queries which existing sports coaching solutions cannot facilitate without an inordinate amount of manual indexing.
Image Parsing with a Wide Range of Classes and Scene-Level Context
This paper presents a nonparametric scene parsing approach that improves the
overall accuracy, as well as the coverage of foreground classes in scene
images. We first improve the label likelihood estimates at superpixels by
merging likelihood scores from different probabilistic classifiers. This boosts
the classification performance and enriches the representation of
less-represented classes. Our second contribution consists of incorporating
semantic context in the parsing process through global label costs. Our method
does not rely on image retrieval sets but rather assigns a global likelihood
estimate to each label, which is plugged into the overall energy function. We
evaluate our system on two large-scale datasets, SIFTflow and LMSun. We achieve
state-of-the-art performance on the SIFTflow dataset and near-record results on
LMSun.
Comment: Published at CVPR 2015, Computer Vision and Pattern Recognition
(CVPR), 2015 IEEE Conference o
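The two ideas in the scene parsing abstract above, merging label likelihoods from several classifiers and adding a global per-label cost, can be illustrated for a single superpixel with the sketch below. It is an assumption-laden toy and not the paper's method: the weighted-average fusion rule, the negative-log energy, and the names `merge_likelihoods` and `pick_label` are all hypothetical, and the pairwise terms a full energy function would normally contain are omitted.

```python
import math


def merge_likelihoods(per_classifier_scores, weights=None):
    """Fuse label likelihoods from several classifiers for one superpixel.

    Assumption: fusion is a weighted average followed by renormalisation.
    The abstract does not say which fusion rule the paper uses.
    """
    n = len(per_classifier_scores)
    weights = weights or [1.0 / n] * n
    labels = per_classifier_scores[0].keys()
    merged = {
        label: sum(w * scores[label] for w, scores in zip(weights, per_classifier_scores))
        for label in labels
    }
    total = sum(merged.values())
    return {label: value / total for label, value in merged.items()}


def pick_label(merged, global_cost):
    """Choose the label minimising -log(likelihood) + global label cost.

    Assumption: the global cost is a per-label additive penalty; labels
    missing from `global_cost` get a penalty of zero.
    """
    def energy(label):
        return -math.log(max(merged[label], 1e-12)) + global_cost.get(label, 0.0)

    return min(merged, key=energy)


# Hypothetical scores from two classifiers for one superpixel.
classifier_a = {"sky": 0.7, "tree": 0.2, "person": 0.1}
classifier_b = {"sky": 0.4, "tree": 0.2, "person": 0.4}

merged = merge_likelihoods([classifier_a, classifier_b])
label_without_context = pick_label(merged, {})
label_with_context = pick_label(merged, {"sky": 2.0})
print(merged, label_without_context, label_with_context)
```

With no global costs the fused scores favour "sky"; penalising "sky" at the scene level shifts the choice to "person", which is the kind of effect a global label cost is meant to have.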