8,963 research outputs found
Unsupervised Graph-based Rank Aggregation for Improved Retrieval
This paper presents a robust and comprehensive graph-based rank aggregation
approach, used to combine results of isolated ranker models in retrieval tasks.
The method follows an unsupervised scheme, which is independent of how the
isolated ranks are formulated. Our approach is able to combine arbitrary
models, defined in terms of different ranking criteria, such as those based on
textual, image or hybrid content representations.
We reformulate the ad-hoc retrieval problem as a document retrieval based on
fusion graphs, which we propose as a new unified representation model capable
of merging multiple ranks and expressing inter-relationships of retrieval
results automatically. By doing so, we claim that the retrieval system can
benefit from learning the manifold structure of datasets, thus leading to more
effective results. Another contribution is that our graph-based aggregation
formulation, unlike existing approaches, allows for encapsulating contextual
information encoded from multiple ranks, which can be directly used for
ranking, without further computations and post-processing steps over the
graphs. Based on the graphs, a novel similarity retrieval score is formulated
using an efficient computation of minimum common subgraphs. Finally, another
benefit over existing approaches is the absence of hyperparameters.
A comprehensive experimental evaluation was conducted considering diverse
well-known public datasets, composed of textual, image, and multimodal
documents. Performed experiments demonstrate that our method reaches top
performance, yielding better effectiveness scores than state-of-the-art
baseline methods and promoting large gains over the rankers being fused, thus
demonstrating the successful capability of the proposal in representing queries
based on a unified graph-based model of rank fusions
Leveraging Expert Models for Training Deep Neural Networks in Scarce Data Domains: Application to Offline Handwritten Signature Verification
This paper introduces a novel approach to leverage the knowledge of existing
expert models for training new Convolutional Neural Networks, on domains where
task-specific data are limited or unavailable. The presented scheme is applied
in offline handwritten signature verification (OffSV) which, akin to other
biometric applications, suffers from inherent data limitations due to
regulatory restrictions. The proposed Student-Teacher (S-T) configuration
utilizes feature-based knowledge distillation (FKD), combining graph-based
similarity for local activations with global similarity measures to supervise
student's training, using only handwritten text data. Remarkably, the models
trained using this technique exhibit comparable, if not superior, performance
to the teacher model across three popular signature datasets. More importantly,
these results are attained without employing any signatures during the feature
extraction training process. This study demonstrates the efficacy of leveraging
existing expert models to overcome data scarcity challenges in OffSV and
potentially other related domains
Persistence Bag-of-Words for Topological Data Analysis
Persistent homology (PH) is a rigorous mathematical theory that provides a
robust descriptor of data in the form of persistence diagrams (PDs). PDs
exhibit, however, complex structure and are difficult to integrate in today's
machine learning workflows. This paper introduces persistence bag-of-words: a
novel and stable vectorized representation of PDs that enables the seamless
integration with machine learning. Comprehensive experiments show that the new
representation achieves state-of-the-art performance and beyond in much less
time than alternative approaches.Comment: Accepted for the Twenty-Eight International Joint Conference on
Artificial Intelligence (IJCAI-19). arXiv admin note: substantial text
overlap with arXiv:1802.0485
- …