15 research outputs found

    A latent variable ranking model for content-based retrieval

    Get PDF
    34th European Conference on IR Research, ECIR 2012, Barcelona, Spain, April 1-5, 2012. Proceedings. Since their introduction, ranking SVM models [11] have become a powerful tool for training content-based retrieval systems. All we need to train a model are retrieval examples in the form of triplet constraints, i.e. examples specifying that, relative to some query, a database item a should be ranked higher than database item b. Such constraints can be obtained from feedback of users of the retrieval system. Most previous ranking models learn either a global combination of elementary similarity functions or a combination defined with respect to a single database item. Instead, we propose a “coarse to fine” ranking model: given a query, we first compute a distribution over “coarse” classes and then use the linear combination that has been optimized for queries of that class. These coarse classes are hidden and need to be induced by the training algorithm. We propose a latent variable ranking model that induces both the latent classes and the weights of the linear combination for each class from ranking triplets. Our experiments on two large image datasets and a text retrieval dataset show the advantages of our model over learning a global combination as well as a combination for each test point (i.e. the transductive setting). Furthermore, compared to the transductive approach, our model has a clear computational advantage since it does not need to be retrained for each test query. Spanish Ministry of Science and Innovation (JCI-2009-04240); EU PASCAL2 Network of Excellence (FP7-ICT-216886)
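    The triplet constraints described above are typically enforced with a margin-based hinge loss. The following is a minimal sketch of that objective, not the paper's latent-class model; the elementwise query-item feature construction is an illustrative assumption.

```python
import numpy as np

def triplet_hinge_loss(w, q, a, b, margin=1.0):
    """Hinge loss for one ranking constraint: given query q, item a
    should score higher than item b under weight vector w.
    Scores are linear combinations of query-item similarity features
    (here illustrated as the elementwise product of the vectors)."""
    s_a = np.dot(w, q * a)  # score of item a for query q
    s_b = np.dot(w, q * b)  # score of item b for query q
    return max(0.0, margin - (s_a - s_b))

# Toy example with three-dimensional similarity features.
w = np.array([0.5, 1.0, 0.2])
q = np.array([1.0, 0.0, 1.0])
a = np.array([0.9, 0.1, 0.8])  # item that should rank higher
b = np.array([0.1, 0.9, 0.1])  # item that should rank lower
loss = triplet_hinge_loss(w, q, a, b)
```

    A loss of zero means the constraint is satisfied with the required margin; violated constraints contribute a positive penalty that a learner would minimize over all observed triplets.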

    The Information Needs of Mobile Searchers: A Framework

    Get PDF
    The growing use of Internet-connected mobile devices demands that we reconsider search user interface design in light of the context and information needs specific to mobile users. In this paper, the authors present a framework of mobile information needs, juxtaposing search motives (casual, lookup, learn, and investigate) with search types (informational, geographic, personal information management, and transactional)

    Podify: a podcast streaming platform with automatic logging of user behaviour for academic research

    Get PDF
    Podcasts are spoken documents that, in recent years, have gained widespread popularity. Despite the growing research interest in this domain, conducting user studies remains challenging due to the lack of datasets that include user behaviour. In particular, there is a need for a podcast streaming platform that reduces the overhead of conducting user studies. To address these issues, in this work, we present Podify, the first web-based platform for podcast streaming and consumption specifically designed for research. The platform closely resembles existing streaming systems to give users a high level of familiarity on both desktop and mobile. A catalogue of podcast episodes can be easily created via RSS feeds. The platform also offers Elasticsearch-based indexing and search that is highly customisable, allowing research and experimentation in podcast search. Users can manually curate playlists of podcast episodes for consumption. In addition to mechanisms for collecting explicit feedback from users (i.e., liking and disliking behaviour), Podify automatically collects implicit feedback (i.e., all user interactions). Users' behaviour can be easily exported to a readable format for subsequent experimental analysis. A demonstration of the platform is available at https://youtu.be/k9Z5w_KKHr8, with the code and documentation available at https://github.com/NeuraSearch/Podify

    E-Learning Courses Evaluation on the Basis of Trainees' Feedback on Open Questions Text Analysis

    Get PDF
    Life-long learning is a necessity associated with the requirements of the fourth industrial revolution. Although distance online education played a major role in the evolution of the modern education system, its share grew dramatically because of the COVID-19 pandemic outbreak and the social distancing measures that were imposed. However, the quick and extensive adoption of online learning tools also highlighted the multidimensional weaknesses of online education and the needs that arise when considering such practices. To this end, the ease of collecting digital data, as well as the overall evolution of data analytics, enables researchers, and by extension educators, to systematically evaluate the pros and cons of such systems. For instance, advanced data mining methods can be used to find potential areas of concern or to confirm elements of excellence. In this work, we used text analysis methods on data that emerged from participants' feedback in online lifelong learning programmes for professional development. We analysed 1890 Greek text-based answers of participants to open evaluation questions using standard text analysis processes. We produced 7-gram tokens from the words in the texts, from which we constructed meaningful sentences and characterised them as positive or negative. We introduced a new metric, called acceptance grade, to quantify how positively or negatively each sentence evaluates the online courses. We then based our evaluation on the top 10 sentences of each category (positive, negative). Validation of the results via two external experts and data triangulation showed an accuracy of 80%
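    The 7-gram tokens mentioned above are contiguous seven-word windows over the answer texts. A minimal sketch of that tokenization step (the paper's exact preprocessing pipeline is not specified, so this is illustrative only):

```python
def word_ngrams(text, n=7):
    """Split text into words and return every contiguous n-word
    sequence; the paper's 7-gram tokens correspond to n=7."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Toy English stand-in for the Greek free-text answers.
sample = "the course material was clear and the instructor answered questions promptly"
grams = word_ngrams(sample, n=7)
```

    Each resulting 7-gram is a candidate fragment from which sentences can be reconstructed and then labelled positive or negative.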

    Retrieval Enhancements for Task-Based Web Search

    Get PDF
    The task-based view of web search implies that retrieval should take the user perspective into account. Going beyond merely retrieving the most relevant result set for the current query, the retrieval system should aim to surface results that are actually useful to the task that motivated the query. This dissertation explores how retrieval systems can better understand and support their users’ tasks from three main angles: First, we study and quantify search engine user behavior during complex writing tasks, and how task success and behavior are associated in such settings. Second, we investigate search engine queries formulated as questions, and explore patterns in a query log of nearly one billion such queries that may help search engines to better support this increasingly prevalent interaction pattern. Third, we propose a novel approach to reranking the search result lists produced by web search engines, taking into account retrieval axioms that formally specify properties of a good ranking
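    One common way to operationalize axiom-based reranking is to treat each axiom as a pairwise preference between documents and rerank by aggregated preference votes. The sketch below illustrates that general idea under simplified assumptions; the axiom shown is a toy term-frequency preference, not one taken from the dissertation.

```python
def axiomatic_rerank(ranking, axioms):
    """Rerank a result list by counting, for each document, how many
    axiom preferences favour it over the other documents. Each axiom
    is a function (d1, d2) -> True if d1 should precede d2, False if
    d2 should precede d1, or None if it is indifferent."""
    def votes(doc):
        total = 0
        for other in ranking:
            if other == doc:
                continue
            for axiom in axioms:
                pref = axiom(doc, other)
                if pref is True:
                    total += 1
                elif pref is False:
                    total -= 1
        return total
    # Stable sort: ties keep the original (baseline) ranking order.
    return sorted(ranking, key=votes, reverse=True)

# Toy axiom: prefer the document with the higher query-term frequency.
docs = {"d1": 3, "d2": 5, "d3": 1}  # doc id -> query-term frequency
tf_axiom = lambda a, b: (True if docs[a] > docs[b]
                         else (False if docs[a] < docs[b] else None))
reranked = axiomatic_rerank(["d1", "d2", "d3"], [tf_axiom])
```

    Using the baseline order to break ties keeps the reranking conservative: documents move only when the axioms actively prefer them.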

    Cheap IR Evaluation: Fewer Topics, No Relevance Judgements, and Crowdsourced Assessments

    Get PDF
    To evaluate Information Retrieval (IR) effectiveness, a possible approach is to use test collections, which are composed of a collection of documents, a set of descriptions of information needs (called topics), and, for each topic, a set of relevant documents. Test collections are modelled in a competition scenario: for example, in the well-known TREC initiative, participants run their own retrieval systems over a set of topics and provide a ranked list of retrieved documents; some of the retrieved documents (usually the highest-ranked) constitute the so-called pool, and their relevance is evaluated by human assessors; the judged documents are then used to compute effectiveness metrics and rank the participant systems. Private web search companies also run their own in-house evaluation exercises; although the details are mostly unknown and the aims are somewhat different, the overall approach shares several issues with the test collection approach. The aim of this work is to: (i) develop and improve some state-of-the-art work on the evaluation of IR effectiveness while saving resources, and (ii) propose a novel, more principled and engineered, overall approach to test collection based effectiveness evaluation. [...]
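    The pooling step described above, forming the set of documents to be judged, is simply the union of the top-ranked results across all participating systems. A minimal sketch:

```python
def build_pool(runs, depth=100):
    """Form the judgement pool for one topic: the union of the
    top-`depth` documents from each participating system's ranked
    list. Only pooled documents are shown to human assessors;
    everything outside the pool is typically treated as unjudged."""
    pool = set()
    for ranked_list in runs:
        pool.update(ranked_list[:depth])
    return pool

# Two toy system runs for the same topic, pooled to depth 2.
runs = [
    ["d1", "d2", "d3", "d4"],
    ["d2", "d5", "d1", "d6"],
]
pool = build_pool(runs, depth=2)
```

    The pool depth trades assessment cost against completeness of the relevance judgements, which is exactly the resource-saving tension this work addresses.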

    Relevance-based language models : new estimations and applications

    Get PDF
    [Abstract] Relevance-Based Language Models introduced into the Language Modelling framework the concept of relevance, which is explicit in other retrieval models such as the Probabilistic models. Relevance Models have mainly been used for a specific task within Information Retrieval called Pseudo-Relevance Feedback, a kind of local query expansion technique where relevance is assumed over the top documents from the initial retrieval, and where those documents are used to select expansion terms for the original query and produce a second, hopefully more effective, retrieval. In this thesis we investigate new estimations for Relevance Models, both for Pseudo-Relevance Feedback and for tasks beyond retrieval, particularly constrained text clustering and item recommendation in Recommender Systems. We study the benefits of our proposals for those tasks in comparison with existing estimations. These new models not only improve the effectiveness of the existing estimations and methods but also improve their robustness, a critical factor when dealing with Pseudo-Relevance Feedback methods. These objectives are pursued by different means: promoting divergent terms in the estimation of the Relevance Models, presenting new cluster-based retrieval models, introducing new methods for automatically determining the size of the pseudo-relevant set on a per-query basis, and deriving new models under the Relevance-Based Language Modelling framework for the constrained text clustering and item recommendation problems
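    The classic relevance model estimation (RM1) that this line of work builds on weights each term by its probability in the pseudo-relevant documents, with each document weighted by its query likelihood. A simplified sketch with maximum-likelihood (unsmoothed) document models, for illustration only; practical implementations smooth the document language models:

```python
from collections import Counter

def rm1(query, top_docs):
    """Estimate a relevance model (RM1) from pseudo-relevant docs:
    P(w|R) is proportional to the sum over documents of
    P(w|d) * P(q|d), where P(q|d) is the query likelihood under
    the document's unigram model. No smoothing, for illustration."""
    term_scores = Counter()
    for doc in top_docs:
        counts = Counter(doc)
        length = len(doc)
        p_w_d = {w: c / length for w, c in counts.items()}
        q_lik = 1.0  # query likelihood P(q|d)
        for t in query:
            q_lik *= p_w_d.get(t, 0.0)
        for w, p in p_w_d.items():
            term_scores[w] += p * q_lik
    total = sum(term_scores.values())
    return {w: s / total for w, s in term_scores.items()} if total else {}

# Toy pseudo-relevant set (tokenized documents) for the query "rank".
docs = [["learning", "to", "rank", "rank"],
        ["rank", "retrieval", "models", "rank"]]
model = rm1(["rank"], docs)
expansion = sorted(model, key=model.get, reverse=True)
```

    The highest-weighted terms of the estimated model serve as expansion terms for the second retrieval; the thesis's contributions modify this estimation, e.g. by promoting divergent terms.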