TopX : efficient and versatile top-k query processing for text, structured, and semistructured data
TopX is a top-k retrieval engine for text and XML data. Unlike Boolean engines, it stops query processing as soon as it can safely determine the k top-ranked result objects according to a monotonic score aggregation function with respect to a multidimensional query. The main contributions of the thesis fall into four areas, each backed by previous publications at international conferences or workshops:
• Top-k query processing with probabilistic guarantees.
• Index-access optimized top-k query processing.
• Dynamic and self-tuning, incremental query expansion for top-k query processing.
• Efficient support for ranked XML retrieval and full-text search.
Our experiments demonstrate the viability and improved efficiency of our approach compared to existing related work for a broad variety of retrieval scenarios.
Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning
In the context of continual learning, prototypes, as representative class
embeddings, offer advantages in memory conservation and the mitigation of
catastrophic forgetting. However, challenges related to semantic drift and
prototype interference persist. In this study, we introduce the Contrastive
Prototypical Prompt (CPP) approach. Through task-specific prompt-tuning,
underpinned by a contrastive learning objective, we effectively address both
aforementioned challenges. Our evaluations on four challenging
class-incremental benchmarks reveal that CPP achieves a significant 4% to 6%
improvement over state-of-the-art methods. Importantly, CPP operates without a
rehearsal buffer and narrows the performance divergence between continual and
offline joint-learning, suggesting an innovative scheme for Transformer-based
continual learning systems. Comment: Accepted to WACV 2024. Code is available at
https://github.com/LzVv123456/Contrastive-Prototypical-Promp
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significance for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review on existing methods.
Furthermore, with the goal of advancing the state of the art in HRRS image
retrieval, we focus on the feature extraction issue and delve into how to use
powerful deep representations to address this task. We conduct a systematic
investigation of the factors that may affect the performance
of deep features. By optimizing each factor, we achieve remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental observations in detail and draw conclusions from our
analysis. Our work can serve as a guide for research on
content-based RS image retrieval.
From Categories to Classifier: Name-Only Continual Learning by Exploring the Web
Continual Learning (CL) often relies on the availability of extensive
annotated datasets, an assumption that is unrealistically time-consuming and
costly in practice. We explore a novel paradigm termed name-only continual
learning where time and cost constraints prohibit manual annotation. In this
scenario, learners adapt to new category shifts using only category names
without the luxury of annotated training data. Our proposed solution leverages
the expansive and ever-evolving internet to query and download uncurated
webly-supervised data for image classification. We investigate the reliability
of our web data and find them comparable, and in some cases superior, to
manually annotated datasets. Additionally, we show that by harnessing the web,
we can create support sets that surpass state-of-the-art name-only
classification methods that create support sets using generative models or image
retrieval from LAION-5B, achieving a boost of up to 25% in accuracy. When applied
across varied continual learning contexts, our method consistently exhibits a
small performance gap in comparison to models trained on manually annotated
datasets. We present EvoTrends, a class-incremental dataset made from the web
to capture real-world trends, created in just minutes. Overall, this paper
underscores the potential of using uncurated webly-supervised data to mitigate
the challenges associated with manual data labeling in continual learning.
MapReduce-based Solutions for Scalable SPARQL Querying
The use of RDF to expose semantic data on the Web has seen a dramatic increase over the last few years. RDF datasets are now so large and interconnected that classical single-node solutions face significant scalability problems when managing big semantic data. MapReduce, a standard framework for distributed processing of large quantities of data, is earning a place among the distributed solutions that address RDF scalability issues. In this article, we survey the most important works addressing RDF management and querying through diverse MapReduce approaches, with a focus on their main strategies, optimizations and results.
Webly Supervised Learning of Convolutional Networks
We present an approach to utilize large amounts of web data for learning
CNNs. Specifically inspired by curriculum learning, we present a two-step
approach for CNN training. First, we use easy images to train an initial visual
representation. We then use this initial CNN and adapt it to harder, more
realistic images by leveraging the structure of data and categories. We
demonstrate that our two-stage CNN outperforms a fine-tuned CNN trained on
ImageNet on Pascal VOC 2012. We also demonstrate the strength of webly
supervised learning by localizing objects in web images and training an R-CNN
style detector. It achieves the best performance on VOC 2007 where no VOC
training data is used. Finally, we show our approach is quite robust to noise
and performs comparably even when we use image search results from March 2013
(pre-CNN image search era).