6,733 research outputs found
Inexpensive fusion methods for enhancing feature detection
Recent successful approaches to high-level feature detection in image and video data have treated the problem as a pattern classification task. These typically leverage the techniques learned from statistical machine learning, coupled with ensemble architectures that create multiple feature detection models. Once created, co-occurrence between learned features can be captured to further boost performance. At multiple stages throughout these frameworks, various pieces of evidence can be fused together in order to boost performance. These approaches whilst very successful are computationally expensive, and depending on the task, require the use of significant computational resources. In this paper we propose two fusion methods that aim to combine the output of an initial basic statistical machine learning approach with a lower-quality information source, in order to gain diversity in the classified results whilst requiring only modest computing resources. Our approaches, validated experimentally on TRECVid data, are designed to be complementary to existing frameworks and can be regarded as possible replacements for the more computationally expensive combination strategies used elsewhere
Evaluation of trackers for Pan-Tilt-Zoom Scenarios
Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in
computer vision for many years. Compared to tracking with a still camera, the
images captured with a PTZ camera are highly dynamic in nature because the
camera can perform large motion resulting in quickly changing capture
conditions. Furthermore, tracking with a PTZ camera involves camera control to
position the camera on the target. For successful tracking and camera control,
the tracker must be fast enough, or has to be able to predict accurately the
next position of the target. Therefore, standard benchmarks do not allow to
assess properly the quality of a tracker for the PTZ scenario. In this work, we
use a virtual PTZ framework to evaluate different tracking algorithms and
compare their performances. We also extend the framework to add target position
prediction for the next frame, accounting for camera motion and processing
delays. By doing this, we can assess if predicting can make long-term tracking
more robust as it may help slower algorithms for keeping the target in the
field of view of the camera. Results confirm that both speed and robustness are
required for tracking under the PTZ scenario.Comment: 6 pages, 2 figures, International Conference on Pattern Recognition
and Artificial Intelligence 201
Unsupervised Graph-based Rank Aggregation for Improved Retrieval
This paper presents a robust and comprehensive graph-based rank aggregation
approach, used to combine results of isolated ranker models in retrieval tasks.
The method follows an unsupervised scheme, which is independent of how the
isolated ranks are formulated. Our approach is able to combine arbitrary
models, defined in terms of different ranking criteria, such as those based on
textual, image or hybrid content representations.
We reformulate the ad-hoc retrieval problem as a document retrieval based on
fusion graphs, which we propose as a new unified representation model capable
of merging multiple ranks and expressing inter-relationships of retrieval
results automatically. By doing so, we claim that the retrieval system can
benefit from learning the manifold structure of datasets, thus leading to more
effective results. Another contribution is that our graph-based aggregation
formulation, unlike existing approaches, allows for encapsulating contextual
information encoded from multiple ranks, which can be directly used for
ranking, without further computations and post-processing steps over the
graphs. Based on the graphs, a novel similarity retrieval score is formulated
using an efficient computation of minimum common subgraphs. Finally, another
benefit over existing approaches is the absence of hyperparameters.
A comprehensive experimental evaluation was conducted considering diverse
well-known public datasets, composed of textual, image, and multimodal
documents. Performed experiments demonstrate that our method reaches top
performance, yielding better effectiveness scores than state-of-the-art
baseline methods and promoting large gains over the rankers being fused, thus
demonstrating the successful capability of the proposal in representing queries
based on a unified graph-based model of rank fusions
- …