
    CLEF 2017 NewsREEL Overview: Offline and Online Evaluation of Stream-based News Recommender Systems

    The CLEF NewsREEL challenge allows researchers to evaluate news recommendation algorithms both online (NewsREEL Live) and offline (NewsREEL Replay). Compared with the previous year, NewsREEL challenged participants with a higher volume of messages and new news portals. In the 2017 edition of the CLEF NewsREEL challenge, a wide variety of new approaches was implemented, ranging from the use of existing machine learning frameworks to ensemble methods and deep neural networks. This paper gives an overview of the implemented approaches and discusses the evaluation results. In addition, the main results of the Living Lab and the Replay task are explained.

    NewsREEL Multimedia at MediaEval 2018: News Recommendation with Image and Text Content

    NewsREEL Multimedia premieres in 2018 as part of the MediaEval Benchmarking Initiative. The NewsREEL task combines recommendation algorithms with image and text analysis. Participants must predict the popularity of news items based on text snippets and annotated images. Several major German news portals have supplied data. The algorithms are evaluated in terms of Precision and Average Precision on unknown data. This paper describes the task and the provided data in detail and explains the applied evaluation approach.
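
    A minimal sketch of how precision@k and average precision could be computed for a ranked list of predicted-popular news items is shown below. The item identifiers and scoring code are purely illustrative and are not the official NewsREEL Multimedia evaluation script.

```python
# Illustrative only (not the official NewsREEL Multimedia scorer):
# precision@k and average precision for a ranked list of items predicted
# to be popular, scored against a set of items that actually became popular.

def precision_at_k(ranked_item_ids, relevant_item_ids, k):
    """Fraction of the top-k ranked items that are actually relevant."""
    top_k = ranked_item_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_item_ids)
    return hits / k

def average_precision(ranked_item_ids, relevant_item_ids):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    if not relevant_item_ids:
        return 0.0
    hits = 0
    precisions = []
    for k, item in enumerate(ranked_item_ids, start=1):
        if item in relevant_item_ids:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant_item_ids)

# Toy usage with hypothetical item ids
ranked = ["n3", "n7", "n1", "n9", "n4"]
relevant = {"n3", "n9"}
print(precision_at_k(ranked, relevant, 3))  # 1/3 ≈ 0.33
print(average_precision(ranked, relevant))  # (1/1 + 2/4) / 2 = 0.75
```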

    A stream-based resource for multi-dimensional evaluation of recommender algorithms

    Recommender system research has evolved to focus on developing algorithms capable of high performance in online systems. This development calls for a new evaluation infrastructure that supports multi-dimensional evaluation of recommender systems. Today's researchers should analyze algorithms with respect to a variety of aspects, including predictive performance and scalability. Researchers need to subject algorithms to realistic conditions in online A/B tests. We introduce two resources supporting such evaluation methodologies: the new data set of stream recommendation interactions released for CLEF NewsREEL 2017, and the new Open Recommendation Platform (ORP). The data set allows researchers to study a stream recommendation problem closely by "replaying" it locally, and ORP makes it possible to take this evaluation "live" in a living lab scenario. Specifically, ORP allows researchers to deploy their algorithms in a live stream to carry out A/B tests. To our knowledge, NewsREEL is the first online news recommender system resource to be put at the disposal of the research community. In order to encourage others to develop comparable resources for a wide range of domains, we present a list of practical lessons learned in the development of the data set and ORP.
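
    To illustrate the "replay" idea, the sketch below runs a toy most-popular recommender over a recorded event stream in timestamp order and counts offline hits. The event schema and recommender interface are assumptions for illustration only; they are not the actual NewsREEL data format or the ORP API.

```python
# Illustrative sketch of stream replay; the event fields and recommender
# interface are hypothetical, not the NewsREEL schema or the ORP API.
from collections import defaultdict

class MostPopularRecommender:
    """Toy baseline: recommend the items clicked most often so far."""
    def __init__(self):
        self.click_counts = defaultdict(int)

    def recommend(self, k=5):
        ranked = sorted(self.click_counts, key=self.click_counts.get, reverse=True)
        return ranked[:k]

    def observe_click(self, item_id):
        self.click_counts[item_id] += 1

def replay(events, recommender, k=5):
    """Process timestamped events in order; count a hit when a user clicks an
    item that was in the list most recently served to that user."""
    last_served = {}  # user_id -> most recently recommended item ids
    hits = served = 0
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["type"] == "request":
            last_served[event["user_id"]] = recommender.recommend(k)
            served += 1
        elif event["type"] == "click":
            if event["item_id"] in last_served.get(event["user_id"], []):
                hits += 1
            recommender.observe_click(event["item_id"])
    return hits / served if served else 0.0

# Toy usage with made-up events
events = [
    {"timestamp": 1, "type": "click",   "user_id": "u1", "item_id": "a"},
    {"timestamp": 2, "type": "request", "user_id": "u2"},
    {"timestamp": 3, "type": "click",   "user_id": "u2", "item_id": "a"},
]
print(replay(events, MostPopularRecommender()))  # 1 hit / 1 request = 1.0
```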

    CLEF 2017 NewsREEL overview: A stream-based recommender task for evaluation and education

    News recommender systems provide users with access to news stories that they find interesting and relevant. Like other online, stream-based recommender systems, they face particular challenges, including limited information on users’ preferences and rapidly fluctuating item collections. In addition, technical aspects, such as response time and scalability, must be considered. Both algorithmic and technical considerations shape working requirements for real-world recommender systems in businesses. NewsREEL represents a unique opportunity to evaluate recommendation algorithms and for students to experience realistic conditions and enlarge their skill sets. The NewsREEL Challenge requires participants to conduct data-driven experiments in NewsREEL Replay as well as to deploy their best models into NewsREEL Live’s ‘living lab’. This paper presents NewsREEL 2017 and provides insights into the effectiveness of NewsREEL in supporting the goals of instructors teaching recommender systems. We discuss the experiences of NewsREEL participants as well as those of instructors teaching recommender systems to students, and in this way we showcase NewsREEL’s ability to support the education of future data scientists.

    K-Space at TRECVID 2008

    In this paper we describe K-Space’s participation in TRECVid 2008 in the interactive search task. For 2008 the K-Space group performed one of the largest interactive video information retrieval experiments conducted in a laboratory setting. Three institutions participated in a multi-site, multi-system experiment. In total 36 users took part, 12 each from Dublin City University (DCU, Ireland), the University of Glasgow (GU, Scotland) and Centrum Wiskunde & Informatica (CWI, the Netherlands). Three user interfaces were developed: two from DCU, which were also used in 2007, and one from GU. All interfaces leveraged the same search service. Using a Latin squares arrangement, each user completed 12 topics, yielding 6 runs per site and 18 runs in total. We officially submitted 3 of these runs to NIST for evaluation, together with an additional expert run using a 4th system. Our submitted runs performed around the median. In this paper we present an overview of the search system used, the experimental setup and a preliminary analysis of our results.
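
    For readers unfamiliar with the design, the sketch below shows one generic way to build a cyclic Latin-square assignment of topics to users, so that every topic appears once per user and once per presentation position. It is illustrative only and simplifies the actual multi-site, multi-interface setup.

```python
# Generic cyclic Latin-square construction, for illustration only; this is not
# the actual K-Space experimental design code, and it ignores the additional
# balancing over sites and interfaces used in the real experiment.

def latin_square_assignment(users, topics):
    """Return a dict mapping each user to an ordered list of topics such that
    every topic occurs once per user and once per position (a Latin square).
    This cyclic construction needs exactly as many users as topics."""
    n = len(topics)
    assert len(users) == n, "cyclic construction needs len(users) == len(topics)"
    return {
        user: [topics[(row + col) % n] for col in range(n)]
        for row, user in enumerate(users)
    }

# Toy usage: 4 users, 4 topics
schedule = latin_square_assignment(["u1", "u2", "u3", "u4"],
                                   ["t1", "t2", "t3", "t4"])
for user, order in schedule.items():
    print(user, order)
# u1 ['t1', 't2', 't3', 't4']
# u2 ['t2', 't3', 't4', 't1']
# u3 ['t3', 't4', 't1', 't2']
# u4 ['t4', 't1', 't2', 't3']
```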

    Idomaar: a framework for multi-dimensional benchmarking of recommender algorithms

    In real-world scenarios, recommenders face non-functional requirements of a technical nature and must handle dynamic data in the form of sequential streams. Evaluation of recommender systems must take these issues into account in order to be maximally informative. In this paper, we present Idomaar, a framework that enables the efficient multi-dimensional benchmarking of recommender algorithms. Idomaar goes beyond current academic research practices by creating a realistic evaluation environment and computing both effectiveness and technical metrics for stream-based as well as set-based evaluation. A scenario focusing on the “research to prototyping to productization” cycle at a company illustrates Idomaar’s potential. We show that Idomaar simplifies testing with varying configurations and supports flexible integration of different data.
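
    As a rough illustration of evaluating along more than one dimension, the sketch below records an effectiveness metric (hit rate) together with technical metrics (mean and 95th-percentile response time) in a single evaluation loop. The recommender interface and request fields are hypothetical and are not part of the Idomaar framework.

```python
# Hypothetical multi-dimensional evaluation loop; the recommender interface
# and request fields are illustrative and not part of the Idomaar framework.
import time

def evaluate(recommender, requests, k=5):
    """Each request is assumed to carry a 'user_id' and the 'clicked_item'
    observed afterwards; both the hit rate and response times are recorded."""
    hits = 0
    latencies = []
    for req in requests:
        start = time.perf_counter()
        recs = recommender.recommend(req["user_id"], k)
        latencies.append(time.perf_counter() - start)
        if req["clicked_item"] in recs:
            hits += 1
    latencies.sort()
    return {
        "hit_rate": hits / len(requests),
        "mean_latency_ms": 1000 * sum(latencies) / len(latencies),
        "p95_latency_ms": 1000 * latencies[int(0.95 * (len(latencies) - 1))],
    }
```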

    Continuous evaluation of large-scale information access systems: a case for living labs

    A/B testing is increasingly being adopted for the evaluation of commercial information access systems with a large user base, since it allows the efficiency and effectiveness of these systems to be observed under real conditions. Unfortunately, unless university-based researchers closely collaborate with industry or develop their own infrastructure or user base, they cannot validate their ideas in live settings with real users. Without online testing opportunities open to the research community, academic researchers are unable to employ online evaluation on a larger scale. This means that they do not get feedback on their ideas and cannot advance their research further. Businesses, on the other hand, miss the opportunity to achieve higher customer satisfaction through improved systems. In addition, users miss the chance to benefit from an improved information access system. In this chapter, we introduce two evaluation initiatives at CLEF, NewsREEL and Living Labs for IR (LL4IR), that aim to address this growing “evaluation gap” between academia and industry. We explain the challenges and discuss the experience of organizing these living labs.

    Overview of NewsREEL’16: Multi-dimensional evaluation of real-time stream-recommendation algorithms

    Successful news recommendation requires facing the challenges of dynamic item sets and contextual item relevance, as well as fulfilling non-functional requirements such as response time. The CLEF NewsREEL challenge is a campaign-style evaluation lab that allows participants to tackle news recommendation and to optimize and evaluate their recommender algorithms both online and offline. In this paper, we summarize the objectives and challenges of NewsREEL 2016. We cover two contrasting perspectives on the challenge: that of the operator (the business providing recommendations) and that of the challenge participant (the researchers developing recommender algorithms). In the intersection of these perspectives, new insights can be gained on how to effectively evaluate real-time stream recommendation algorithms.

    Pre-hospital prediction of adverse outcomes in patients with suspected COVID-19: Development, application and comparison of machine learning and deep learning methods

    Background: COVID-19 infected millions of people and increased mortality worldwide. Patients with suspected COVID-19 utilised emergency medical services (EMS) and attended emergency departments, resulting in increased pressures and waiting times. Rapid and accurate decision-making is required to identify patients at high risk of clinical deterioration following COVID-19 infection, whilst also avoiding unnecessary hospital admissions. Our study aimed to develop artificial intelligence models to predict adverse outcomes in suspected COVID-19 patients attended by EMS clinicians. Method: Linked ambulance service data were obtained for 7,549 adult patients with suspected COVID-19 infection attended by EMS clinicians in the Yorkshire and Humber region (England) from 18-03-2020 to 29-06-2020. We used support vector machines (SVM), extreme gradient boosting, artificial neural network (ANN) models, ensemble learning methods and logistic regression to predict the primary outcome (death or need for organ support within 30 days). Models were compared with two baselines: the decision made by EMS clinicians to convey patients to hospital, and the PRIEST clinical severity score. Results: Of the 7,549 patients attended by EMS clinicians, 1,330 (17.6%) experienced the primary outcome. Machine learning methods showed slight improvements in sensitivity over the baseline results. Further improvements were obtained using stacking ensemble methods; the best geometric mean (GM) results were obtained using SVM and ANN as base learners when maximising sensitivity and specificity. Conclusions: These methods could potentially reduce the number of patients conveyed to hospital without a concomitant increase in adverse outcomes. Further work is required to test the models externally and to develop an automated system for use in clinical settings.
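
    To illustrate the geometric-mean criterion mentioned above, the sketch below computes sensitivity, specificity and their geometric mean (GM), and picks the probability threshold that maximises GM on validation data. It is a toy example under assumed inputs, not the study's actual modelling pipeline.

```python
# Toy illustration of threshold selection by geometric mean of sensitivity
# and specificity; the scores and labels are made up, and this is not the
# study's actual pipeline.
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true positive rate) and specificity (true negative rate)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec

def best_gm_threshold(y_true, y_scores, thresholds=np.linspace(0.05, 0.95, 19)):
    """Return (threshold, GM) maximising sqrt(sensitivity * specificity)."""
    best_t, best_gm = None, -1.0
    for t in thresholds:
        y_pred = (np.asarray(y_scores) >= t).astype(int)
        sens, spec = sensitivity_specificity(y_true, y_pred)
        gm = float(np.sqrt(sens * spec))
        if gm > best_gm:
            best_t, best_gm = float(t), gm
    return best_t, best_gm

# Toy usage with made-up predicted probabilities
y_true   = [1, 0, 0, 1, 0, 1, 0, 0]
y_scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.35, 0.5]
print(best_gm_threshold(y_true, y_scores))  # roughly (0.55, 1.0) on this toy data
```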

    Rethinking the test collection methodology for personal self-tracking data

    While vast volumes of personal data are being gathered daily by individuals, the MMM community has not really been tackling the challenge of developing novel retrieval algorithms for this data, due to the difficulty of getting access to the data in the first place. While initial efforts have taken place on a small scale, it is our conjecture that a new evaluation paradigm is required in order to make progress in analysing, modelling and retrieving from personal data archives. In this position paper, we propose a new model of Evaluation-as-a-Service that re-imagines the test collection methodology for personal multimedia data in order to address the many challenges of releasing test collections of personal multimedia data.