UV follow-up observations of five recently active novae in M31
We recently initiated the Transient UV Objects (TUVO) project, in which we search for serendipitous UV transients in near-real time in Swift/UVOT data using a purpose-built pipeline.
The user perspective in professional information search
Computer Systems, Imagery and Media
Is the search engine of the future a chatbot? (Is de zoekmachine van de toekomst een chatbot?)
Inaugural lecture delivered by Prof. Dr. Suzan Verberne upon accepting the chair of Professor of Natural Language Processing at Leiden University on Monday 3 June 2024. Text also available in English: "Is the search engine of the future a chatbot?"
Computer Systems, Imagery and Media
A Test Collection of Synthetic Documents for Training Rankers: ChatGPT vs. Human Experts
We investigate the usefulness of generative large language models (LLMs) in generating training data for cross-encoder re-rankers in a novel direction: generating synthetic documents instead of synthetic queries. We introduce a new dataset, ChatGPT-RetrievalQA, and compare the effectiveness of strong models fine-tuned on both LLM-generated and human-generated data. We build ChatGPT-RetrievalQA based on an existing dataset, the human ChatGPT comparison corpus (HC3), consisting of multiple public question collections featuring both human- and ChatGPT-generated responses. We fine-tune a range of cross-encoder re-rankers on either human-generated or ChatGPT-generated data. Our evaluation on MS MARCO DEV, TREC DL'19, and TREC DL'20 demonstrates that cross-encoder re-ranking models trained on LLM-generated responses are significantly more effective for out-of-domain re-ranking than those trained on human responses. For in-domain re-ranking, however, the human-trained re-rankers outperform the LLM-trained re-rankers. Our novel findings suggest that generative LLMs have high potential in generating training data for neural retrieval models and can be used to augment training data, especially in domains with less labeled data. ChatGPT-RetrievalQA presents various opportunities for analyzing and improving rankers with both human- and LLM-generated data. Our data, code, and model checkpoints are publicly available.
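As a rough illustration of the fine-tuning setup described above (not the authors' released code), the sketch below trains a cross-encoder re-ranker on (query, response, label) pairs. It assumes the sentence-transformers CrossEncoder API; the file names, field names, base model, and hyperparameters are placeholders.

# Sketch: fine-tune a cross-encoder re-ranker on (query, response) pairs,
# where responses come either from ChatGPT or from human experts.
# Data paths and the JSONL format below are hypothetical placeholders.
import json
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

def load_pairs(path):
    # Each line: {"query": ..., "response": ..., "label": 0 or 1}
    samples = []
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            samples.append(InputExample(texts=[rec["query"], rec["response"]],
                                        label=float(rec["label"])))
    return samples

# Train two re-rankers: one on LLM-generated, one on human-generated responses.
for source in ("chatgpt", "human"):
    train_samples = load_pairs(f"chatgpt_retrievalqa_{source}_train.jsonl")  # hypothetical path
    model = CrossEncoder("microsoft/MiniLM-L12-H384-uncased", num_labels=1, max_length=512)
    loader = DataLoader(train_samples, shuffle=True, batch_size=32)
    model.fit(train_dataloader=loader, epochs=1, warmup_steps=1000)
    model.save(f"reranker_{source}")

At inference time, each trained model re-scores the candidate passages returned by a first-stage retriever (e.g., BM25) and the list is re-ordered by those scores, the standard two-stage setup behind MS MARCO and TREC DL style comparisons.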
Using skipgrams and POS-based feature selection for patent classification
Contains fulltext: 116289.pdf (publisher's version, Open Access), 19 p.
Transfer Learning for Health-related Twitter Data
Algorithms and the Foundations of Software Technology
Citation Metrics for Legal Information Retrieval Systems
This paper examines citations in legal information retrieval. Citation metrics can be a factor of relevance in the ranking algorithms of legal information retrieval systems. We provide an overview of the Dutch legal publishing culture. To analyze citations in legal publications, we manually analyze a set of documents and record by what (types of) documents they are cited: document type, intended audience, actual audience, and author affiliations. An analysis of 9 cited documents and 217 citing documents shows no strict separation in citations between documents aimed at scholars and documents aimed at practitioners. Our results suggest that citations in legal documents do not measure the impact on scholarly publications and scholars alone, but a broader scope of impact, or relevance, for the legal field.
Computer Science
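To make the first claim concrete (citation metrics as one relevance factor in ranking), here is a minimal, purely illustrative sketch that blends a textual retrieval score with a log-scaled citation count; the weights and scaling are assumptions, not a description of any system analyzed in the paper.

# Sketch: blend a retrieval score with a citation-based prior.
# The 0.8/0.2 weighting and the log scaling are illustrative assumptions.
import math

def blended_score(text_score: float, citation_count: int,
                  w_text: float = 0.8, w_cite: float = 0.2) -> float:
    # Log scaling dampens the effect of very frequently cited documents.
    cite_score = math.log1p(citation_count)
    return w_text * text_score + w_cite * cite_score

# Example: a moderately matching but heavily cited judgment can outrank
# a slightly better textual match with no citations.
print(blended_score(text_score=2.1, citation_count=40))  # ~2.42
print(blended_score(text_score=2.3, citation_count=0))   # 1.84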
CLosER: Conversational Legal Longformer with Expertise-Aware Passage Response Ranker for Long Contexts
In this paper, we investigate the task of response ranking in conversational legal search. We propose a novel method for conversational passage response retrieval (ConvPR) for long conversations in domains with mixed levels of expertise. Conversational legal search is challenging because the domain includes long, multi-participant dialogues with domain-specific language. Furthermore, as opposed to other domains, there is typically a large knowledge gap between the questioner (a layperson) and the responders (lawyers) participating in the same conversation. We collect and release a large-scale real-world dataset called LegalConv with nearly one million legal conversations from a legal community question answering (CQA) platform. We address the particular challenges of processing legal conversations with our novel Conversational Legal Longformer with Expertise-Aware Response Ranker, called CLosER. The proposed method has two main innovations compared to state-of-the-art methods for ConvPR: (i) Expertise-Aware Post-Training, a learning objective that takes into account the knowledge gap between the participants in the conversation; and (ii) a simple but effective strategy for re-ordering the context utterances in long conversations to overcome the limitations of the sparse attention mechanism of the Longformer architecture. Evaluation on LegalConv shows that our proposed method substantially and significantly outperforms existing state-of-the-art models on the response selection task. Our analysis indicates that our Expertise-Aware Post-Training, i.e., continued pre-training or domain/task adaptation, plays an important role in the achieved effectiveness. Our proposed method is generalizable to other tasks with domain-specific challenges and can facilitate future research on conversational search in other domains.
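The abstract does not spell out the re-ordering strategy, so the sketch below is only a hypothetical illustration of innovation (ii): re-order context utterances before truncating the conversation to the encoder's input window, here by keeping expert turns and the most recent turns; CLosER's actual strategy may differ, and the tokenizer and separator token are assumptions.

# Hypothetical sketch of re-ordering long-conversation context before encoding.
# The ordering criterion (expert responses first, then most recent turns) is an
# assumption for illustration; CLosER's actual strategy may differ.
from dataclasses import dataclass

@dataclass
class Utterance:
    text: str
    turn: int          # position in the original conversation
    is_expert: bool    # e.g., the author is a lawyer on the CQA platform

def reorder_context(utterances, tokenizer, max_tokens=4096):
    # Put expert turns first, then the rest; within each group, most recent first,
    # so that truncation drops the oldest layperson turns.
    ordered = sorted(utterances, key=lambda u: (not u.is_expert, -u.turn))
    kept, used = [], 0
    for u in ordered:
        n = len(tokenizer.tokenize(u.text))
        if used + n > max_tokens:
            break
        kept.append(u)
        used += n
    # Restore chronological order among the kept turns for the final input string.
    kept.sort(key=lambda u: u.turn)
    return " </s> ".join(u.text for u in kept)

Sorting back to chronological order after truncation keeps dialogue order intact for the encoder, while the priority ordering ensures that the turns dropped by the length limit are the oldest layperson turns.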
- …