962 research outputs found

An experimental comparison of a genetic algorithm and a hill-climber for term selection

Authors: MacFarlane A.; May P.; Secker A.; Timmis J.
Publication venue: Emerald
Publication date: 01/01/2010

Purpose – The term selection problem for selecting query terms in information filtering and routing has been investigated using hill-climbers of various kinds, largely through the Okapi experiments in the TREC series of conferences. Although these are simple deterministic approaches which examine the effect of changing the weight of one term at a time, they have been shown to improve the retrieval effectiveness of filtering queries in these TREC experiments. Hill-climbers are, however, likely to get trapped in local optima, and more sophisticated local search techniques that attempt to break out of these optima are worth investigating. To this end, we apply a genetic algorithm (GA) to the same problem. Design/Methodology/Approach – We use a standard TREC test collection from the TREC-8 filtering track, recording mean average precision and recall measures to allow comparison between the hill-climber and the GA. We also vary elements of the GA, such as the probability of a word being included, the probability of mutation and the population size, in order to measure the effect of these variables. Different strategies such as Elitist and Non-Elitist methods are used, as well as Roulette Wheel and Rank selection. Findings – The results of tests suggest that both techniques are, on average, better than the baseline, but the implemented GA does not match the overall performance of a hill-climber. The Rank selection algorithm does better on average than the Roulette Wheel algorithm. There is no evidence in this study that varying word inclusion probability, mutation probability or Elitist method makes much difference to the overall results. Small population sizes do not appear to be as effective as larger population sizes.
Research limitations/implications – The evidence provided here suggests that being stuck in a local optimum for the term selection optimization problem does not appear to be detrimental to the overall success of the hill-climber. Term rank order would appear to provide additional evidence which hill-climbers can use efficiently and effectively to narrow the search space. Originality/Value – The paper represents the first attempt to compare hill-climbers with GAs on a problem of this type.
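The two search strategies compared in this abstract can be sketched on a toy problem. The sketch below is illustrative only: the per-term gains in `TERM_GAIN`, the additive `score` function and all parameter defaults are assumptions made for the example, not the TREC-8 objective or the settings used in the paper. Because the toy objective is additive, the hill-climber reaches the global optimum by construction, so the code shows the mechanics of each method rather than reproducing the reported findings.

```python
import random

# Hypothetical per-term utilities (assumption). The paper optimizes retrieval
# effectiveness on TREC-8 filtering data, which is not reproduced here.
TERM_GAIN = [0.30, -0.10, 0.25, -0.05, 0.40, -0.20, 0.15, -0.15]


def score(mask):
    """Toy objective: total gain of the selected terms."""
    return sum(gain for gain, keep in zip(TERM_GAIN, mask) if keep)


def hill_climb(mask):
    """Deterministic hill-climber: toggle one term at a time, keep improvements."""
    mask = list(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(mask)):
            candidate = mask[:]
            candidate[i] = not candidate[i]
            if score(candidate) > score(mask):
                mask, improved = candidate, True
    return mask


def rank_select(population, rng):
    """Rank selection: the selection weight depends on rank, not raw fitness."""
    ranked = sorted(population, key=score)           # worst individual first
    weights = list(range(1, len(ranked) + 1))        # rank 1..N as weight
    return rng.choices(ranked, weights=weights, k=1)[0]


def genetic_algorithm(pop_size=20, generations=30, p_include=0.5,
                      p_mutation=0.05, seed=0):
    """Elitist GA with rank selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(TERM_GAIN)
    population = [[rng.random() < p_include for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(population, key=score)
        children = [elite[:]]                        # elitism: keep the best
        while len(children) < pop_size:
            a = rank_select(population, rng)
            b = rank_select(population, rng)
            cut = rng.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not bit) if rng.random() < p_mutation else bit
                     for bit in child]
            children.append(child)
        population = children
    return max(population, key=score)
```

Calling `hill_climb([False] * len(TERM_GAIN))` and `genetic_algorithm()` returns one boolean mask each, which can be compared with `score`. Swapping `rank_select` for a fitness-proportional choice would give the Roulette Wheel variant the abstract mentions.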

City Research Online

Crossref

Aberystwyth Research Portal

Multiple Retrieval Models and Regression Models for Prior Art Search

Authors: Lopez Patrice; Romary Laurent
Publication date: 01/01/2009

This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach presents three main characteristics:
1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German), producing ten different sets of ranked results.
2. The merging of the different results based on multiple regression models using an additional validation set created from the patent collection.
3. The exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model.
As we exploit specific metadata of the patent documents and the citation relations only at the creation of initial working sets and during the final post-ranking step, our architecture remains generic and easy to extend.

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

The impact of the global financial crisis on mining in Katanga

Author: Cuvelier Jeroen
Publication venue: International Peace Information Service (IPIS)
Publication date: 01/01/2009

Ghent University Academic Bibliography

From discourse to practice: a sharper perspective on the relationship between minerals and violence in DR Congo

Authors: Perks Rachel; Vlassenroot Koen
Publication venue: International Alert
Publication date: 01/01/2010

Ghent University Academic Bibliography

DCU's experiments in NTCIR-8 IR4QA task

Authors: Jiang Jie; Jones Gareth J.F.; Leveling Johannes; Min Jinming; Way Andy
Publication venue: National Institute of Informatics (NII)
Publication date: 01/01/2010

We describe DCU's participation in the NTCIR-8 IR4QA task [16]. This task is a cross-language information retrieval (CLIR) task from English to Simplified Chinese which seeks to provide relevant documents for later cross-language question answering (CLQA) tasks. For the IR4QA task, we submitted five official runs: two monolingual runs and three CLIR runs. For the monolingual retrieval we tested two information retrieval models. The results show that the KL-Divergence language model method performs better than the Okapi BM25 model for the Simplified Chinese retrieval task. This agrees with our previous CLIR experimental results at NTCIR-5. For the CLIR task, we compare query translation and document translation methods. In the query translation based runs, we tested a method for query expansion from an external resource (QEE) before query translation. Our result for this run is slightly lower than the run without QEE. Our results show that the document translation method achieves 68.24% of the MAP of our best query translation run. For the document translation method, we found that the main issue is the lack of named entity translation in the documents, since we do not have a suitable parallel corpus to train the statistical machine translation system. Our best CLIR run comes from the combination of query translation using Google Translate and the KL-Divergence language model retrieval method. It achieves 79.94% MAP relative to our best monolingual run.
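The relative MAP figures quoted in this abstract are ratios of one run's mean average precision to that of a reference run. The following minimal sketch shows that arithmetic on made-up data. The function names, the topic id `t1` and the document ids are hypothetical, and average precision is normalized by the number of relevant documents, which is the usual convention but is an assumption about how the task was scored.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank   # precision at this relevant hit
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(runs, qrels):
    """MAP over topics; `runs` and `qrels` are dicts keyed by topic id."""
    return sum(average_precision(runs[t], qrels[t]) for t in qrels) / len(qrels)


def relative_map(run_map, reference_map):
    """A run's MAP expressed as a percentage of a reference run's MAP."""
    return 100.0 * run_map / reference_map


# Made-up judgments and rankings for a single topic (illustration only).
qrels = {"t1": {"d1", "d3"}}
monolingual_run = {"t1": ["d1", "d2", "d3", "d4"]}
clir_run = {"t1": ["d2", "d1", "d4", "d3"]}

mono_map = mean_average_precision(monolingual_run, qrels)
clir_map = mean_average_precision(clir_run, qrels)
print(f"CLIR run reaches {relative_map(clir_map, mono_map):.2f}% of monolingual MAP")
```

In this toy case the monolingual run has MAP (1 + 2/3) / 2 and the CLIR run has MAP (1/2 + 2/4) / 2, so the CLIR run reaches 60% of the monolingual MAP.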

DCU Online Research Access Service

Okapi-based XML indexing

Authors: Lu W.; MacFarlane A.; Venuti F.
Publication venue: Emerald
Publication date: 18/09/2009

Purpose – Being an important data exchange and information storage standard, XML has generated a great deal of interest and particular attention has been paid to the issue of XML indexing. Clear use cases for structured search in XML have been established. However, most of the research in the area is either based on relational database systems or specialized semi‐structured data management systems. This paper aims to propose a method for XML indexing based on the information retrieval (IR) system Okapi. Design/methodology/approach – First, the paper reviews the structure of inverted files and explains why this indexing mechanism cannot properly support XML retrieval, using the underlying data structures of Okapi as an example. Then the paper explores a revised method implemented on Okapi using path indexing structures. The paper evaluates these index structures through the metrics of indexing run time, path search run time and space costs using the INEX and Reuters RCV1 collections. Findings – Initial results on the INEX collections show that there is a substantial overhead in space costs for the method, but this increase does not affect run time adversely. Indexing results on differing sized Reuters RCV1 sub‐collections show that the increase in space costs with increasing collection size is significant, but in terms of run time the increase is linear. Path search results show sub‐millisecond run times, demonstrating minimal overhead for XML search. Practical implications – Overall, the results show the method implemented to support XML search in a traditional IR system such as Okapi is viable. Originality/value – The paper provides useful information on a method for XML indexing based on the IR system Okapi.
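To make the idea of a path index concrete, here is a minimal sketch of an inverted index keyed by element path and term. It is not the Okapi data structure described in the paper, whose layout the abstract does not give. The function `build_path_index`, the key shape `(path, term)` and the sample documents are all assumptions for illustration, and text that follows a child element (the `tail` in ElementTree terms) is ignored for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(docs):
    """Map (element path, term) -> set of ids of documents containing it."""
    index = defaultdict(set)

    def walk(elem, parent_path, doc_id):
        path = f"{parent_path}/{elem.tag}"
        # Index only the text directly inside this element.
        for term in (elem.text or "").lower().split():
            index[(path, term)].add(doc_id)
        for child in elem:
            walk(child, path, doc_id)

    for doc_id, xml_text in docs.items():
        walk(ET.fromstring(xml_text), "", doc_id)
    return index


# Hypothetical two-document collection.
docs = {
    "d1": "<article><title>Okapi indexing</title><body>XML search</body></article>",
    "d2": "<article><title>XML retrieval</title><body>Okapi weighting</body></article>",
}
index = build_path_index(docs)
print(sorted(index.get(("/article/title", "xml"), set())))
```

A plain inverted file would store only `term -> documents`, so a query for "xml" could not distinguish a title match from a body match. Keying postings by path supports that distinction at the cost of more index entries, which is in line with the space overhead the abstract reports.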

City Research Online

Crossref

Lucene4IR: Developing information retrieval evaluation resources using Lucene

Authors: Alkhawaldeh Rami S.; Azzopardi Leif; Balog Krisztian; Ceccarelli Diego; Di Buccio Emanuele; Fernández-Luna Juan M.; Halvey Martin; Hull Charlie; Mannix Jake; Moshfeghi Yashar; Palchowdhury Sauparna
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

The workshop and hackathon on developing Information Retrieval Evaluation Resources using Lucene (L4IR) was held on the 8th and 9th of September, 2016 at the University of Strathclyde in Glasgow, UK and funded by the ESF Elias Network. The event featured three main elements: (i) a series of keynote and invited talks on industry, teaching and evaluation; (ii) planning, coding and hacking, where a number of groups created modules and infrastructure to use Lucene to undertake TREC-based evaluations; and (iii) a number of breakout groups discussing challenges, opportunities and problems in bridging the divide between academia and industry, and how we can use Lucene for teaching and learning Information Retrieval (IR). The event brought together a mix of academics, experts and students wanting to learn, share and create evaluation resources for the community. The hacking was intense and the discussions lively, creating the basis of many useful tools but also raising numerous issues. It was clear that by adopting and contributing to the most widely used and supported Open Source IR toolkit, there were many benefits for academics, students, researchers, developers and practitioners: a basis for stronger evaluation practices, increased reproducibility, more efficient knowledge transfer, greater collaboration between academia and industry, and shared teaching and training resources.

University of Strathclyde Institutional Repository

Enlighten

Archivio istituzionale della ricerca - Università di Padova

Supporting aspect-based video browsing - analysis of a user study

Authors: Elliott D.; Hannah D.; Hopfgartner F.; Jose J.M.; Urruty T.
Publication date: 01/01/2009

In this paper, we present a novel video search interface based on the concept of aspect browsing. The proposed strategy is to assist the user in exploratory video search by actively suggesting new query terms and video shots. Our approach has the potential to narrow the "Semantic Gap" by allowing users to explore the data collection. First, we describe a clustering technique to identify potential aspects of a search. Then, we use the results to propose suggestions to the user to help them in their search task. Finally, we analyse this approach by exploiting the log files and the feedback from a user study.

CiteSeerX

Enlighten

A Multi-criteria Decision Support System for Ph.D. Supervisor Selection: A Hybrid Approach

Authors: Hasan Mir Anamul; Schwartz Daniel
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2019

Selection of a suitable Ph.D. supervisor is a very important step in a student’s career. This paper presents a multi-criteria decision support system to assist students in making this choice. The system employs a hybrid method that first utilizes a fuzzy analytic hierarchy process to extract the relative importance of the identified criteria and sub-criteria to consider when selecting a supervisor. Then, it applies an information retrieval-based similarity algorithm (TF/IDF or Okapi BM25) to retrieve relevant candidate supervisor profiles based on the student’s research interest. The selected profiles are then re-ranked based on other relevant factors chosen by the user, such as publication record, research grant record, and collaboration record. The ranking method evaluates the potential supervisors objectively based on various metrics that are defined in terms of detailed domain-specific knowledge, making part of the decision making automatic. In contrast with other existing works, this system does not require the professor’s involvement and no subjective measures are employed
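The retrieve-then-re-rank pipeline described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' system: the BM25 variant (with the non-negative `+ 1` inside the logarithm), the defaults `k1=1.2` and `b=0.75`, the fixed weights `w_similarity` and `w_publications` (which in the paper would come from the fuzzy analytic hierarchy process), and the sample profiles and publication counts are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` (id -> text) against `query` with BM25."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized.values()) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for d, toks in tokenized.items():
        tf = Counter(toks)
        total = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = k1 * (1.0 - b + b * len(toks) / avgdl)
            total += idf * tf[term] * (k1 + 1.0) / (tf[term] + norm)
        scores[d] = total
    return scores


def retrieve_then_rerank(query, profiles, publications, top_k=2,
                         w_similarity=0.5, w_publications=0.5):
    """Shortlist by BM25, then re-rank by a weighted mix of two criteria."""
    scores = bm25_scores(query, profiles)
    shortlist = sorted(scores, key=scores.get, reverse=True)[:top_k]
    max_score = max(scores[d] for d in shortlist) or 1.0
    max_pubs = max(publications[d] for d in shortlist) or 1
    combined = {
        d: w_similarity * scores[d] / max_score
           + w_publications * publications[d] / max_pubs
        for d in shortlist
    }
    return shortlist, sorted(combined, key=combined.get, reverse=True)


# Hypothetical supervisor profiles and publication counts.
profiles = {
    "alice": "information retrieval ranking evaluation",
    "bob": "fuzzy logic decision support",
    "carol": "information retrieval for decision support systems",
}
publications = {"alice": 10, "bob": 25, "carol": 40}
shortlist, reranked = retrieve_then_rerank("information retrieval",
                                           profiles, publications)
print(shortlist, reranked)
```

In this toy data the shorter profile wins on text similarity alone, while the second stage can reorder the shortlist once a non-textual criterion such as publication record is weighed in. Further criteria (grants, collaborations) would add terms to `combined` in the same way.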

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)