Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval
Humans use context and scene knowledge to easily localize moving objects in
conditions of complex illumination changes, scene clutter and occlusions. In
this paper, we present a method to leverage human knowledge in the form of
annotated video libraries in a novel search and retrieval based setting to
track objects in unseen video sequences. For every video sequence, a document
that represents motion information is generated. Documents of the unseen video
are queried against the library at multiple scales to find videos with similar
motion characteristics. This provides us with coarse localization of objects in
the unseen video. We further adapt these retrieved object locations to the new
video using an efficient warping scheme. The proposed method is validated on
in-the-wild video surveillance datasets where we outperform state-of-the-art
appearance-based trackers. We also introduce a new challenging dataset with
complex object appearance changes.
Comment: Under review with the IEEE Transactions on Circuits and Systems for
Video Technology
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher accuracy rate and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.
Comment: Published 201
Strategies for Searching Video Content with Text Queries or Video Examples
The large number of user-generated videos uploaded to the Internet
every day has led to many commercial video search engines, which mainly rely on
text metadata for search. However, metadata is often lacking for user-generated
videos, so these videos are unsearchable by current search engines.
Content-based video retrieval (CBVR) tackles this metadata-scarcity
problem by directly analyzing the visual and audio streams of each video. CBVR
encompasses multiple research topics, including low-level feature design,
feature fusion, semantic detector training and video search/reranking. We
present novel strategies in these topics to enhance CBVR in both accuracy and
speed under different query inputs, including pure textual queries and query by
video examples. Our proposed strategies have been incorporated into our
submission for the TRECVID 2014 Multimedia Event Detection evaluation, where
our system outperformed other submissions in both text queries and video
example queries, thus demonstrating the effectiveness of our proposed
approaches.
Video analytics system for surveillance videos
Developing an intelligent inspection system that can enhance public safety is challenging. An efficient video analytics system can help monitor unusual events and mitigate possible damage or loss. This thesis aims to analyze surveillance video data, report abnormal activities and retrieve the corresponding video clips. The surveillance video dataset used in this thesis is derived from the ALERT dataset, a collection of surveillance videos recorded at airport security checkpoints.
The video analytics system in this thesis can be thought of as a pipeline. The system takes surveillance video as input and passes it through a series of processing stages: object detection, multi-object tracking, person-bin association and re-identification. The output is the set of trajectories of passengers and baggage in the surveillance videos. Abnormal events, such as a person taking away someone else's belongings, are detected and trigger an alarm automatically. The system can also retrieve the corresponding video clips based on a user-defined query.
High-level feature detection from video in TRECVid: a 5-year retrospective of achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, or on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically, however, this depends on being able to determine whether each feature is or is not present in a video clip.
The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement on where the state-of-the-art is regarding this important task, not just for one research group or for one approach, but across the spectrum. We then use this past and on-going work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video.
Open Set Logo Detection and Retrieval
Current logo retrieval research focuses on closed set scenarios. We argue
that the logo domain is too large for this strategy and requires an open set
approach. To foster research in this direction, a large-scale logo dataset,
called Logos in the Wild, is collected and released to the public. A typical
open set logo retrieval application is, for example, assessing the
effectiveness of advertising in sports event broadcasts. Given a query sample
in the form of a logo image, the task is to find all further occurrences of this
logo in a set of images or videos. Currently, common logo retrieval approaches
are unsuitable for this task because of their closed world assumption. Thus, an
open set logo retrieval method is proposed in this work which allows searching
for previously unseen logos with a single query sample. A two-stage concept with
separate logo detection and comparison is proposed where both modules are based
on task-specific CNNs. When trained on the Logos in the Wild data, significant
performance improvements are observed, especially compared with
state-of-the-art closed set approaches.
Comment: accepted at VISAPP 201