Search CORE

1,264 research outputs found

Using Apache Lucene to Search Vector of Locally Aggregated Descriptors

Author: Amato Giuseppe
Bolettieri Paolo
Falchi Fabrizio
Gennaro Claudio
Vadicamo Lucia
Publication date: 01/01/2016

Surrogate Text Representation (STR) is an effective approach to efficient similarity search in metric spaces using conventional text search engines, such as Apache Lucene. The technique compares permutations of a set of reference objects instead of computing the original metric distance. However, the Achilles heel of the STR approach is the need to reorder the result set of the search according to the metric distance. This forces the use of a support database to store the original objects, which requires efficient random I/O on fast secondary memory (such as flash-based storage). In this paper, we propose extending the Surrogate Text Representation to specifically address a class of visual metric objects known as Vector of Locally Aggregated Descriptors (VLAD). The approach represents the individual sub-vectors forming the VLAD vector with the STR, providing a finer representation of the vector and allowing us to eliminate the reordering phase. Experiments on a publicly available dataset show that the extended STR outperforms the baseline STR, achieving satisfactory performance close to that obtained with the original VLAD vectors.

Comment: In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 4: VISAPP, p. 383-39
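The permutation idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the term format (`p0`, `b1p2`), the use of Euclidean distance, and the rule of repeating a term `k - rank` times are all assumptions chosen for illustration. Each object is turned into a string in which closer reference objects appear more often, so that a text engine's term-frequency scoring approximates permutation similarity; for a VLAD-like vector, the same encoding is applied to each sub-vector with a block-specific term prefix.

```python
import math


def surrogate_text(vec, pivots, k, prefix="p"):
    """Encode `vec` as text: the k nearest pivots, closer ones repeated more.

    Hypothetical helper; the repetition rule (k - rank) is an assumption.
    """
    # Rank pivot indices by Euclidean distance to vec (closest first).
    ranked = sorted(range(len(pivots)), key=lambda i: math.dist(vec, pivots[i]))
    terms = []
    for rank, idx in enumerate(ranked[:k]):
        # The closest pivot is emitted k times, the next k - 1 times, and so on.
        terms.extend([f"{prefix}{idx}"] * (k - rank))
    return " ".join(terms)


def vlad_surrogate_text(vlad, sub_dim, pivots_per_block, k):
    """Apply surrogate_text to each sub-vector of a VLAD-like vector.

    `pivots_per_block[b]` holds the reference objects for sub-vector b
    (an assumed layout, not taken from the paper).
    """
    parts = []
    for b, start in enumerate(range(0, len(vlad), sub_dim)):
        sub = vlad[start:start + sub_dim]
        # Prefix terms with the block index so blocks do not share vocabulary.
        parts.append(surrogate_text(sub, pivots_per_block[b], k, prefix=f"b{b}p"))
    return " ".join(parts)
```

The resulting strings could then be indexed as ordinary text fields by an engine such as Lucene; how the indexing and query weighting are configured is outside this sketch.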

arXiv.org e-Print Archive

Crossref

Lucene4IR: Developing information retrieval evaluation resources using Lucene

Author: Alkhawaldeh Rami S.
Azzopardi Leif
Balog Krisztian
Ceccarelli Diego
Di Buccio Emanuele
Fernández-Luna Juan M.
Halvey Martin
Hull Charlie
Mannix Jake
Moshfeghi Yashar
Palchowdhury Sauparna
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

The workshop and hackathon on developing Information Retrieval Evaluation Resources using Lucene (L4IR) was held on the 8th and 9th of September, 2016 at the University of Strathclyde in Glasgow, UK, and funded by the ESF Elias Network. The event featured three main elements: (i) a series of keynote and invited talks on industry, teaching and evaluation; (ii) planning, coding and hacking, where a number of groups created modules and infrastructure to use Lucene to undertake TREC-based evaluations; and (iii) a number of breakout groups discussing challenges, opportunities and problems in bridging the divide between academia and industry, and how we can use Lucene for teaching and learning Information Retrieval (IR). The event brought together a mix of academics, experts and students wanting to learn, share and create evaluation resources for the community. The hacking was intense and the discussions lively, creating the basis of many useful tools but also raising numerous issues. It was clear that adopting and contributing to the most widely used and supported Open Source IR toolkit brought many benefits for academics, students, researchers, developers and practitioners - providing a basis for stronger evaluation practices, increased reproducibility, more efficient knowledge transfer, greater collaboration between academia and industry, and shared teaching and training resources.

University of Strathclyde Institutional Repository

Enlighten

Archivio istituzionale della ricerca - Università di Padova

From Frequency to Meaning: Vector Space Models of Semantics

Author: Pantel Patrick
Turney Peter D.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2010

Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
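As a small illustration of the first matrix class named in the abstract above, the following sketch builds a term-document matrix and compares documents by the cosine of their column vectors. It is a simplified example, not code from the survey: whitespace tokenization, raw counts without any weighting, and the helper names are assumptions made for brevity.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term i in document j."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def column(matrix, j):
    """Extract the vector of document j (one column of the matrix)."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Documents that share many terms have columns pointing in similar directions and therefore a cosine close to 1, while documents with no terms in common score 0.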

arXiv.org e-Print Archive

CiteSeerX

NRC Publications Archive

Crossref

A MultiAgent System for Choosing Software Patterns

Author: Birukou Aliaksandr
Blanzieri Enrico
Giorgini Paolo
Weiss Michael
Publication date: 01/10/2006

Software patterns enable an efficient transfer of design experience by documenting common solutions to recurring design problems. They contain valuable knowledge that can be reused by others, in particular by less experienced developers. Patterns have been published for system architecture and detailed design, as well as for specific application domains (e.g. agents and security). However, given the steadily growing number of patterns in the literature and online repositories, it can be hard for non-experts to select patterns appropriate to their needs, or even to be aware of the existing patterns. In this paper, we present a multi-agent system that supports developers in choosing patterns that are suitable for a given design problem. The system implements an implicit culture approach for recommending patterns to developers based on the history of decisions made by other developers regarding which patterns to use in related design problems. The recommendations are complemented with documents from a pattern repository that can be accessed by the agents. The paper includes a set of experimental results obtained using a repository of security patterns. The results prove the viability of the proposed approach.

Unitn-eprints Research