97 research outputs found

Third International Workshop on Gamification for Information Retrieval (GamifIR'16)

Author: Hopfgartner Frank
Kazai Gabriella
Kruschwitz Udo
Meder Michael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Stronger engagement and greater participation are often crucial to reaching a goal or solving a problem, such as the emerging employee engagement crisis, insufficient knowledge sharing, or chronic procrastination. In many cases we need, and search for, tools to beat procrastination or to change people’s habits. Gamification is an approach that learns from games, which are often fun, creative and engaging. In principle, it is about understanding games and applying game design elements in non-gaming environments. This offers possibilities for wide-ranging improvements: for example, more accurate work, better retention rates and more cost-effective solutions, achieved by making the motivation to participate more intrinsic than with conventional methods. In the context of Information Retrieval (IR), it is not hard to imagine that many tasks could benefit from gamification techniques. Besides several manual annotation tasks on data sets for IR research, user participation is important for gathering implicit or even explicit feedback to feed the algorithms. Gamification, however, comes with its own challenges, and its adoption in IR is still in its infancy. Given the enormous response to the first and second GamifIR workshops, which were both co-located with ECIR, and the broad range of topics discussed, we organized the third workshop at SIGIR 2016 to address a range of emerging challenges and opportunities.

Enlighten

The accessibility dimension for structured document retrieval

Author: Kazai Gabriella
Lalmas Mounia
Quicker Stefan
Roelleke Thomas
Ruthven Ian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2001

Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.

CiteSeerX

Crossref

University of Strathclyde Institutional Repository

A Personalised Reader for Crowd Curated Content

Author: Clarke Daoud
Kazai Gabriella
Venanzi Matteo
Yusof Iskander
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Personalised news recommender systems traditionally rely on content ingested from a select set of publishers and ask users to indicate their interests from a predefined list of topics. They then provide users with a feed of news items for each of their topics. In this demo, we present a mobile app that automatically learns users’ interests from their browsing or Twitter history and provides them with a personalised feed of diverse, crowd curated content. The app also continuously learns from the users’ interactions as they swipe to like or skip items recommended to them. In addition, users can discover trending stories and content liked by other users they follow. The crowd is thus formed of the users, who as a whole act as the curators of the content to be recommended.

Southampton (e-Prints Soton)

Crossref

GamifIR 2016: SIGIR 2016 Workshop on Gamification for Information Retrieval

Author: Hopfgartner Frank
Kazai Gabriella
Kruschwitz Udo
Meder Michael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2016

The third workshop on Gamification for Information Retrieval (GamifIR) took place on the 21st of July 2016 in conjunction with SIGIR 2016 in Pisa, Italy. It was the first GamifIR held in conjunction with SIGIR; the first and second GamifIR workshops were both co-located with ECIR. The workshop program included one invited keynote presentation, seven paper presentations and a discussion session. The keynote presentation stated the necessity of proper theory for gamification design and the resulting opportunities. The paper presentations covered studies on diverse areas and approaches for the application of gamification.

University of Essex Research Repository

University of Regensburg Publication Server

Crossref

Enlighten

On the Social and Technical Challenges of Web Search Autosuggestion Moderation

Author: Diaz Fernando
Golebiewski Michael
Hazen Timothy J.
Kazai Gabriella
Olteanu Alexandra
Publication date: 09/07/2020

Past research shows that users benefit from systems that support them in their writing and exploration tasks. The autosuggestion feature of Web search engines is an example of such a system: It helps users in formulating their queries by offering a list of suggestions as they type. Autosuggestions are typically generated by machine learning (ML) systems trained on a corpus of search logs and document representations. Such automated methods can become prone to issues that result in problematic suggestions that are biased, racist, sexist or in other ways inappropriate. While current search engines have become increasingly proficient at suppressing such problematic suggestions, persistent issues remain. In this paper, we reflect on past efforts and on why certain issues still linger by covering explored solutions along a prototypical pipeline for identifying, detecting, and addressing problematic autosuggestions. To showcase their complexity, we discuss several dimensions of problematic suggestions, difficult issues along the pipeline, and why our discussion applies to the increasing number of applications beyond web search that implement similar textual suggestion features. By outlining persistent social and technical challenges in moderating web search suggestions, we provide a renewed call for action.

arXiv.org e-Print Archive

Rethinking Semi-supervised Learning with Language Models

Author: Aletras Nikolaos
Jiao Yunlong
Kazai Gabriella
Shi Zhengxiang
Tonolini Francesco
Yilmaz Emine
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2023

Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks. Currently, there are two popular approaches to make use of unlabelled data: Self-training (ST) and Task-adaptive pre-training (TAPT). ST uses a teacher model to assign pseudo-labels to the unlabelled data, while TAPT continues pre-training on the unlabelled data before fine-tuning. To the best of our knowledge, the effectiveness of TAPT in SSL tasks has not been systematically studied, and no previous work has directly compared TAPT and ST in terms of their ability to utilize the pool of unlabelled data. In this paper, we provide an extensive empirical study comparing five state-of-the-art ST approaches and TAPT across various NLP tasks and data sizes, including in- and out-of-domain settings. Surprisingly, we find that TAPT is a strong and more robust SSL learner, even when using just a few hundred unlabelled samples or in the presence of domain shifts, compared to more sophisticated ST approaches, and tends to bring greater improvements in SSL than in fully-supervised settings. Our further analysis demonstrates the risks of using ST approaches when the size of labelled or unlabelled data is small or when domain shifts exist. We offer a fresh perspective for future SSL research, suggesting the use of unsupervised pre-training objectives over dependency on pseudo labels

UCL Discovery

Third International Workshop on Gamification

Author: Hopfgartner Frank
Kazai Gabriella
Kruschwitz Udo
Meder Michael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

University of Essex Research Repository

Crossref

Enlighten

Evaluating the effectiveness of content-oriented XML retrieval methods

Author: Gabriella Kazai
Mounia Lalmas
Norbert Fuhr
Norbert Gövert
Publication venue: 'Springer Science and Business Media LLC'

Crossref

INEX 2006 Evaluation Measures

Author: Kamps Jaap
Kazai Gabriella
Lalmas Mounia
Pehcevski Jovan
Piwowarski Benjamin
Robertson Stephen
Publication venue: HAL CCSD
Publication date: 19/08/2007

This paper describes the official measures of retrieval effectiveness employed at the ad hoc track of INEX 2006.

INRIA a CCSD electronic archive server

On Aggregating Labels from Multiple Crowd Workers to Infer Relevance of Documents

Author: Gabriella Kazai
Ingemar J. Cox
Mehdi Hosseini
Nataša Milić-Frayling
Vishwa Vinay
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

We consider the problem of acquiring relevance judgements for information retrieval (IR) test collections through crowdsourcing when no true relevance labels are available. We collect multiple, possibly noisy relevance labels per document from workers of unknown labelling accuracy. We use these labels to infer the document relevance based on two methods. The first method is the commonly used majority voting (MV), which determines the document relevance based on the label that received the most votes, treating all the workers equally. The second is a probabilistic model that concurrently estimates the document relevance and the workers' accuracy using expectation maximization (EM). We run simulations and conduct experiments with crowdsourced relevance labels from the INEX 2010 Book Search track to investigate the accuracy and robustness of the relevance assessments to the noisy labels. We observe the effect of the derived relevance judgments on the ranking of the search systems. Our experimental results show that the EM method outperforms the MV method in the accuracy of relevance assessments and IR systems ranking. The performance improvements are especially noticeable when the number of labels per document is small and the labels are of varied quality.
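
The two aggregation strategies named in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes binary relevance labels, a single accuracy value per worker (a "one-coin" model), a uniform relevance prior, and an arbitrary starting accuracy of 0.7. The function names `majority_vote` and `em_aggregate` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """labels: list of (worker, doc, label) tuples, label in {0, 1}.

    Every worker counts equally; ties fall back to 0 (non-relevant),
    which is an arbitrary choice made for this sketch.
    """
    votes = defaultdict(Counter)
    for _worker, doc, label in labels:
        votes[doc][label] += 1
    return {doc: int(c[1] > c[0]) for doc, c in votes.items()}


def em_aggregate(labels, iterations=20, prior=0.5, init_accuracy=0.7):
    """Jointly estimate P(doc is relevant) and per-worker accuracy.

    Assumed one-coin model: a worker reports the true label with
    probability accuracy[worker], independently of the document.
    """
    workers = {w for w, _, _ in labels}
    docs = {d for _, d, _ in labels}
    accuracy = {w: init_accuracy for w in workers}
    p_rel = {}
    for _ in range(iterations):
        # E-step: posterior relevance of each document given accuracies.
        for doc in docs:
            like_rel, like_non = prior, 1.0 - prior
            for w, d, label in labels:
                if d != doc:
                    continue
                a = accuracy[w]
                like_rel *= a if label == 1 else 1.0 - a
                like_non *= a if label == 0 else 1.0 - a
            p_rel[doc] = like_rel / (like_rel + like_non)
        # M-step: accuracy = expected share of a worker's labels that
        # agree with the current relevance estimates.
        for worker in workers:
            agree, total = 0.0, 0
            for w, d, label in labels:
                if w != worker:
                    continue
                agree += p_rel[d] if label == 1 else 1.0 - p_rel[d]
                total += 1
            # Clamp so that no likelihood term ever becomes exactly zero.
            accuracy[worker] = min(max(agree / total, 0.01), 0.99)
    return p_rel, accuracy


# Toy data: workers A and B label correctly, worker C always disagrees.
truth = {"d1": 1, "d2": 1, "d3": 0, "d4": 0}
labels = []
for doc, rel in truth.items():
    labels += [("A", doc, rel), ("B", doc, rel), ("C", doc, 1 - rel)]

mv = majority_vote(labels)
p_rel, accuracy = em_aggregate(labels)
```

On this toy input both methods recover the same relevance labels; the difference is that the EM sketch also produces a per-worker accuracy estimate, so the consistently wrong worker ends up with a low weight rather than an equal vote.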

CiteSeerX

Crossref