14 research outputs found

Experiment on Style-Dependent Document Ranking

Author: Braslavski P.
Tselischev A.
Браславский П. И.
Publication venue: no publisher stated (б. и.)
Publication date: 01/01/2005

The paper reports on experiments aimed at incorporating style-dependent parameters into ranking schemes for information retrieval tasks. The analysis uses the ROMIP Web collection and the ROMIP-2003 ad hoc track results. Factor analysis techniques were used to extract factors that reflect stylistic properties of documents. The resulting style-dependent parameters and the ranks derived from them are compared. A simple scheme for rank aggregation is proposed. Evaluation of the results shows only a moderate improvement in relevance ranking.
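The abstract does not say which aggregation scheme the authors propose. As a purely illustrative sketch, one simple way to aggregate a relevance ranking with a style-based ranking is a weighted sum of rank positions. The function name `aggregate_ranks`, the `style_weight` parameter, and the tie-breaking rule are assumptions made for this example and are not taken from the paper.

```python
def aggregate_ranks(relevance_order, style_order, style_weight=0.2):
    """Re-rank documents by a weighted sum of two rank positions.

    Hypothetical sketch, not the paper's scheme. Both inputs are lists
    of document ids ordered best first; lower combined score is better.
    """
    rel_pos = {doc: i for i, doc in enumerate(relevance_order)}
    sty_pos = {doc: i for i, doc in enumerate(style_order)}
    # Assumption: documents missing from the style ranking are treated
    # as if they were placed just after its last entry.
    missing = len(style_order)

    def combined(doc):
        return ((1 - style_weight) * rel_pos[doc]
                + style_weight * sty_pos.get(doc, missing))

    # Ties fall back to the original relevance order.
    return sorted(relevance_order, key=lambda d: (combined(d), rel_pos[d]))


relevance_order = ["a", "b", "c"]
style_order = ["c", "b", "a"]
reranked = aggregate_ranks(relevance_order, style_order, style_weight=0.2)
```

With `style_weight=0` the relevance ranking is returned unchanged, so the weight controls how far stylistic evidence is allowed to move a document.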

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Inexpensive fusion methods for enhancing feature detection

Author: Adamek Tomasz
O'Connor Noel E.
Smeaton Alan F.
Wilkins Peter
Publication venue: 'Elsevier BV'
Publication date: 01/08/2007

Recent successful approaches to high-level feature detection in image and video data have treated the problem as a pattern classification task. These typically leverage techniques from statistical machine learning, coupled with ensemble architectures that create multiple feature detection models. Once created, co-occurrence between learned features can be captured to further boost performance. At multiple stages throughout these frameworks, various pieces of evidence can be fused together in order to boost performance. These approaches, whilst very successful, are computationally expensive and, depending on the task, require significant computational resources. In this paper we propose two fusion methods that aim to combine the output of an initial basic statistical machine learning approach with a lower-quality information source, in order to gain diversity in the classified results whilst requiring only modest computing resources. Our approaches, validated experimentally on TRECVid data, are designed to be complementary to existing frameworks and can be regarded as possible replacements for the more computationally expensive combination strategies used elsewhere.

DCU Online Research Access Service

Linguistic Analysis of Users' Queries: towards an adaptive Information Retrieval System

Author: Mothe Josiane
Tanguy Ludovic
Publication venue: HAL CCSD
Publication date: 01/01/2007

Most information retrieval systems transform users' natural-language queries into bags of words that are matched against documents, which are also represented as bags of words. In this process, the richness of the query is lost. In this paper we show that linguistic features of a query are good indicators for predicting a system's failure to answer it. The experiments are based on 42 systems or system variants and 50 TREC topics that consist of a descriptive part expressed in natural language.

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

Unités d'indexation et taille des requêtes pour la recherche d'information en français

Author: Mothe Josiane
Tanguy Ludovic
Publication venue: HAL CCSD
Publication date: 01/01/2007

This paper analyses different indexing methods for French (lemmas, stems and truncated terms) as well as their fusion. We also examine the influence of the different sections of a topic on precision. Our study uses the CLEF French monolingual collections from 2000 to 2005 (six evaluation campaigns). We show that the best method is the one that is based on lemmas and fuses the results obtained with the different sections of a topic; this combination is the most effective for improving mean average precision and high precision. Keywords: information retrieval, fusion, indexing, influence of indexing, information retrieval in French.

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

Properties of optimally weighted data fusion in CBMIR

Author: Ferguson Paul
Smeaton Alan F.
Wilkins Peter
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

Content-Based Multimedia Information Retrieval (CBMIR) systems which leverage multiple retrieval experts (En) often employ a weighting scheme when combining expert results through data fusion. Typically, however, a query will comprise multiple query images (Im), leading to potentially N × M weights to be assigned. Because of the large number of potential weights, existing approaches impose a hierarchy for data fusion, such as uniformly combining query image results from a single retrieval expert into a single list and then weighting the results of each expert. In this paper we will demonstrate that this approach is sub-optimal and leads to the poor state of CBMIR performance in benchmarking evaluations. We utilize an optimization method known as Coordinate Ascent to discover the optimal set of weights (|En| · |Im|), which demonstrates a dramatic difference between known results and the theoretical maximum. We find that imposing common combinatorial hierarchies for data fusion will halve the optimal performance that can be achieved. By examining the optimal weight sets at the topic level, we observe that approximately 15% of the weights (from set |En| · |Im|) for any given query are assigned 70%-82% of the total weight mass for that topic. Furthermore, we discover that the ideal distribution of weights follows a log-normal distribution. We find that we can achieve up to 88% of the performance of a fully optimized query using just these 15% of the weights. Our investigation was conducted on TRECVID evaluations 2003 to 2007 inclusive and ImageCLEFPhoto 2007, totalling 181 search topics optimized over a combined collection size of 661,213 images and 1,594 topic images.
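To make the idea of assigning one weight to every (expert, query image) pair concrete, the following minimal sketch runs a grid-based coordinate ascent over those weights to maximise average precision for a single topic. It is an illustration under stated assumptions, not the authors' implementation: the function names, the weight grid, the toy scores, and the use of average precision as the objective are all hypothetical choices. Because it optimises against known relevance judgments, it yields an oracle upper bound rather than a deployable weighting.

```python
def fuse(scores, weights):
    """Linearly weighted fusion. scores[(expert, image)] maps doc -> score."""
    fused = {}
    for key, doc_scores in scores.items():
        for doc, score in doc_scores.items():
            fused[doc] = fused.get(doc, 0.0) + weights[key] * score
    # Highest fused score first; ties broken by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


def average_precision(ranking, relevant):
    """Average precision of a ranked list against a set of relevant docs."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def coordinate_ascent(scores, relevant,
                      grid=(0.0, 0.25, 0.5, 0.75, 1.0), sweeps=5):
    """Optimise one weight at a time, holding all the others fixed."""
    weights = {key: 1.0 for key in scores}  # start from uniform fusion
    best = average_precision(fuse(scores, weights), relevant)
    for _ in range(sweeps):
        improved = False
        for key in scores:
            best_value = weights[key]
            for candidate in grid:
                weights[key] = candidate
                ap = average_precision(fuse(scores, weights), relevant)
                if ap > best:  # accept strict improvements only
                    best, best_value, improved = ap, candidate, True
            weights[key] = best_value
        if not improved:
            break  # a full sweep changed nothing, so stop early
    return weights, best


# Toy topic (invented data): 2 experts x 2 query images give 4 weights.
toy_scores = {
    ("color", "img1"): {"d1": 0.9, "d2": 0.8, "d3": 0.1},
    ("color", "img2"): {"d3": 0.9, "d4": 0.8, "d1": 0.1},
    ("edge", "img1"): {"d4": 0.9, "d3": 0.7, "d2": 0.2},
    ("edge", "img2"): {"d2": 0.9, "d1": 0.6, "d4": 0.3},
}
toy_relevant = {"d1", "d2"}

uniform_ap = average_precision(
    fuse(toy_scores, {key: 1.0 for key in toy_scores}), toy_relevant)
tuned_weights, tuned_ap = coordinate_ascent(toy_scores, toy_relevant)
```

Comparing `uniform_ap` with `tuned_ap` mirrors, on a very small scale, the gap the abstract describes between uniform combination and per-pair optimised weights.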

Crossref

DCU Online Research Access Service

An investigation into weighted data fusion for content-based multimedia information retrieval

Author: Wilkins Peter
Publication venue: Dublin City University. CLARITY: The Centre for Sensor Web Technologies
Publication date: 01/11/2009

Content-Based Multimedia Information Retrieval (CBMIR) is characterised by the combination of noisy sources of information which, in unison, are able to achieve strong performance. In this thesis we focus on the combination of ranked results from the independent retrieval experts which comprise a CBMIR system through linearly weighted data fusion. The independent retrieval experts are low-level multimedia features, each of which contains an indexing function and ranking algorithm. This thesis consists of two halves. In the first half, we perform a rigorous empirical investigation into the factors which impact upon performance in linearly weighted data fusion. In the second half, we leverage these findings to create a new class of weight generation algorithms for data fusion which are capable of determining weights at query time, such that the weights are topic dependent.

Irish Universities

DCU Online Research Access Service